reprounzip-1.0.10/LICENSE.txt
Copyright (C) 2014-2017, New York University
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

reprounzip-1.0.10/MANIFEST.in
include README.rst
include LICENSE.txt

reprounzip-1.0.10/PKG-INFO
Metadata-Version: 1.1
Name: reprounzip
Version: 1.0.10
Summary: Linux tool enabling reproducible experiments (unpacker)
Home-page: http://vida-nyu.github.io/reprozip/
Author: Remi Rampin
Author-email: remirampin@gmail.com
License: BSD-3-Clause
Description: ReproZip
========
`ReproZip <http://vida-nyu.github.io/reprozip/>`__ is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files, and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in their environment to reproduce the results (unpacking step).
reprounzip
----------
This is the component responsible for the unpacking step on Linux distributions.
Please refer to `reprozip `__, `reprounzip-vagrant `_, and `reprounzip-docker `_ for other components and plugins.
A GUI is available at `reprounzip-qt `_.
Additional Information
----------------------
For more detailed information, please refer to our `website <http://vida-nyu.github.io/reprozip/>`_, as well as to our `documentation `_.
ReproZip is currently being developed at `NYU `_. The team includes:
* `Fernando Chirigati `_
* `Juliana Freire `_
* `Remi Rampin `_
* `Dennis Shasha `_
* `Vicky Steeves `_
Keywords: reprozip,reprounzip,reproducibility,provenance,vida,nyu
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: System :: Archiving

reprounzip-1.0.10/README.rst
ReproZip
========
`ReproZip <http://vida-nyu.github.io/reprozip/>`__ is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files, and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in their environment to reproduce the results (unpacking step).
reprounzip
----------
This is the component responsible for the unpacking step on Linux distributions.
Please refer to `reprozip `__, `reprounzip-vagrant `_, and `reprounzip-docker `_ for other components and plugins.
A GUI is available at `reprounzip-qt `_.
Additional Information
----------------------
For more detailed information, please refer to our `website <http://vida-nyu.github.io/reprozip/>`_, as well as to our `documentation `_.
ReproZip is currently being developed at `NYU `_. The team includes:
* `Fernando Chirigati `_
* `Juliana Freire `_
* `Remi Rampin `_
* `Dennis Shasha `_
* `Vicky Steeves `_

reprounzip-1.0.10/reprounzip/__init__.py
try:  # pragma: no cover
__import__('pkg_resources').declare_namespace(__name__)
except ImportError: # pragma: no cover
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)

reprounzip-1.0.10/reprounzip/common.py
# Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
# This file is shared:
# reprozip/reprozip/common.py
# reprounzip/reprounzip/common.py
"""Common functions between reprozip and reprounzip.
This module contains functions that are specific to the reprozip software and
its data formats, but that are shared between the reprozip and reprounzip
packages. Because the packages can be installed separately, these functions are
in a separate module which is duplicated between the packages.
As long as these are small in number, they are not worth putting in a separate
package that reprozip and reprounzip would both depend on.
"""
from __future__ import division, print_function, unicode_literals
import atexit
import contextlib
import copy
from datetime import datetime
from distutils.version import LooseVersion
import functools
import logging
import logging.handlers
import os
from rpaths import PosixPath, Path
import sys
import tarfile
import usagestats
import yaml
from .utils import iteritems, itervalues, unicode_, stderr, UniqueNames, \
escape, CommonEqualityMixin, optional_return_type, hsize, join_root, \
copyfile
FILE_READ = 0x01
FILE_WRITE = 0x02
FILE_WDIR = 0x04
FILE_STAT = 0x08
FILE_LINK = 0x10
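# A sketch of how these bitmask flags combine (inferred from their
# power-of-two values; the combined modes are not spelled out here):
#
#     mode = FILE_READ | FILE_STAT    # 0x09: the path was read and stat()'d
#     if mode & FILE_WRITE:
#         pass                        # test for a write access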
class File(CommonEqualityMixin):
"""A file, used at some point during the experiment.
"""
comment = None
def __init__(self, path, size=None):
self.path = path
self.size = size
def __eq__(self, other):
return (isinstance(other, File) and
self.path == other.path)
def __hash__(self):
return hash(self.path)
class Package(CommonEqualityMixin):
"""A distribution package, containing a set of files.
"""
def __init__(self, name, version, files=None, packfiles=True, size=None):
self.name = name
self.version = version
self.files = list(files) if files is not None else []
self.packfiles = packfiles
self.size = size
def add_file(self, file_):
self.files.append(file_)
def __unicode__(self):
return '%s (%s)' % (self.name, self.version)
__str__ = __unicode__
# Pack format history:
# 1: used by reprozip 0.2 through 0.7. Single tar.gz file, metadata under
# METADATA/, data under DATA/
# 2: pack is usually not compressed, metadata under METADATA/, data in another
# DATA.tar.gz (files inside it still have the DATA/ prefix for ease-of-use
# in unpackers)
#
# Pack metadata history:
# 0.2: used by reprozip 0.2
# 0.2.1:
# config: comments directories as such in config
# trace database: adds executed_files.workingdir, adds processes.exitcode
# data: packs dynamic linkers
# 0.3:
# config: don't list missing (unpacked) files in config
# trace database: adds opened_files.is_directory
# 0.3.1: no change
# 0.3.2: no change
# 0.4:
# config: adds input_files, output_files, lists parent directories
# 0.4.1: no change
# 0.5: no change
# 0.6: no change
# 0.7:
# moves input_files and output_files from run to global scope
# adds processes.is_thread column to trace database
# 0.8: adds 'id' field to run
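# To illustrate, a version-2 pack might look like this (a sketch assembled
# from the comments above and the RPZPack class below, not a real listing):
#
#     METADATA/version           # e.g. b"REPROZIP VERSION 2\n"
#     METADATA/config.yml
#     METADATA/trace.sqlite3.gz
#     DATA.tar.gz                # holds DATA/usr/bin/..., DATA/home/..., etc.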
class RPZPack(object):
"""Encapsulates operations on the RPZ pack format.
"""
def __init__(self, pack):
self.pack = Path(pack)
self.tar = tarfile.open(str(self.pack), 'r:*')
f = self.tar.extractfile('METADATA/version')
version = f.read()
f.close()
if version.startswith(b'REPROZIP VERSION '):
try:
version = int(version[17:].rstrip())
except ValueError:
version = None
if version in (1, 2):
self.version = version
self.data_prefix = PosixPath(b'DATA')
else:
raise ValueError(
"Unknown format version %r (maybe you should upgrade "
"reprounzip? I only know versions 1 and 2" % version)
else:
raise ValueError("File doesn't appear to be a RPZ pack")
if self.version == 1:
self.data = self.tar
elif version == 2:
self.data = tarfile.open(
fileobj=self.tar.extractfile('DATA.tar.gz'),
mode='r:*')
else:
assert False
def remove_data_prefix(self, path):
if not isinstance(path, PosixPath):
path = PosixPath(path)
components = path.components[1:]
if not components:
return path.__class__('')
return path.__class__(*components)
def open_config(self):
"""Gets the configuration file.
"""
return self.tar.extractfile('METADATA/config.yml')
def extract_config(self, target):
"""Extracts the config to the specified path.
It is up to the caller to remove that file once done.
"""
member = copy.copy(self.tar.getmember('METADATA/config.yml'))
member.name = str(target.components[-1])
self.tar.extract(member,
path=str(Path.cwd() / target.parent))
target.chmod(0o644)
assert target.is_file()
@contextlib.contextmanager
def with_config(self):
"""Context manager that extracts the config to a temporary file.
"""
fd, tmp = Path.tempfile(prefix='reprounzip_')
os.close(fd)
self.extract_config(tmp)
yield tmp
tmp.remove()
def extract_trace(self, target):
"""Extracts the trace database to the specified path.
It is up to the caller to remove that file once done.
"""
target = Path(target)
if self.version == 1:
member = self.tar.getmember('METADATA/trace.sqlite3')
elif self.version == 2:
try:
member = self.tar.getmember('METADATA/trace.sqlite3.gz')
except KeyError:
member = self.tar.getmember('METADATA/trace.sqlite3')
else:
assert False
member = copy.copy(member)
member.name = str(target.components[-1])
self.tar.extract(member,
path=str(Path.cwd() / target.parent))
target.chmod(0o644)
assert target.is_file()
@contextlib.contextmanager
def with_trace(self):
"""Context manager that extracts the trace database to a temporary file.
"""
fd, tmp = Path.tempfile(prefix='reprounzip_')
os.close(fd)
self.extract_trace(tmp)
yield tmp
tmp.remove()
def list_data(self):
"""Returns tarfile.TarInfo objects for all the data paths.
"""
return [copy.copy(m)
for m in self.data.getmembers()
if m.name.startswith('DATA/')]
def data_filenames(self):
"""Returns a set of filenames for all the data paths.
Those paths begin with a slash / and the 'DATA' prefix has been
removed.
"""
return set(PosixPath(m.name[4:])
for m in self.data.getmembers()
if m.name.startswith('DATA/'))
def get_data(self, path):
"""Returns a tarfile.TarInfo object for the data path.
Raises KeyError if no such path exists.
"""
path = PosixPath(path)
path = join_root(PosixPath(b'DATA'), path)
return copy.copy(self.data.getmember(path))
def extract_data(self, root, members):
"""Extracts the given members from the data tarball.
The members must come from get_data().
"""
self.data.extractall(str(root), members)
def copy_data_tar(self, target):
"""Copies the file in which the data lies to the specified destination.
"""
if self.version == 1:
self.pack.copyfile(target)
elif self.version == 2:
with target.open('wb') as fp:
data = self.tar.extractfile('DATA.tar.gz')
copyfile(data, fp)
data.close()
def close(self):
if self.data is not self.tar:
self.data.close()
self.tar.close()
self.data = self.tar = None
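# A minimal usage sketch for RPZPack ('experiment.rpz' is a hypothetical
# pack name; everything else is the API defined above):
#
#     rpz = RPZPack('experiment.rpz')
#     try:
#         with rpz.with_config() as config_file:
#             config = load_config(config_file, canonical=True)
#         print("%d data paths" % len(rpz.data_filenames()))
#     finally:
#         rpz.close()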
class InvalidConfig(ValueError):
"""Configuration file is invalid.
"""
def read_files(files, File=File):
if files is None:
return []
return [File(PosixPath(f)) for f in files]
def read_packages(packages, File=File, Package=Package):
if packages is None:
return []
new_pkgs = []
for pkg in packages:
pkg['files'] = read_files(pkg['files'], File)
new_pkgs.append(Package(**pkg))
return new_pkgs
Config = optional_return_type(['runs', 'packages', 'other_files'],
['inputs_outputs', 'additional_patterns',
'format_version'])
@functools.total_ordering
class InputOutputFile(object):
def __init__(self, path, read_runs, write_runs):
self.path = path
self.read_runs = read_runs
self.write_runs = write_runs
def __eq__(self, other):
return ((self.path, self.read_runs, self.write_runs) ==
(other.path, other.read_runs, other.write_runs))
def __lt__(self, other):
return self.path < other.path
def __repr__(self):
return "" % (
self.path, self.read_runs, self.write_runs)
def load_iofiles(config, runs):
"""Loads the inputs_outputs part of the configuration.
This tests for duplicates, merges the lists of executions, and optionally
loads from the runs for reprozip < 0.7 compatibility.
"""
files_list = config.get('inputs_outputs') or []
# reprozip < 0.7 compatibility: read input_files and output_files from runs
if 'inputs_outputs' not in config:
for i, run in enumerate(runs):
for rkey, wkey in (('input_files', 'read_by_runs'),
('output_files', 'written_by_runs')):
for k, p in iteritems(run.pop(rkey, {})):
files_list.append({'name': k,
'path': p,
wkey: [i]})
files = {} # name:str: InputOutputFile
paths = {} # path:PosixPath: name:str
required_keys = set(['name', 'path'])
optional_keys = set(['read_by_runs', 'written_by_runs'])
uniquenames = UniqueNames()
for i, f in enumerate(files_list):
keys = set(f)
if (not keys.issubset(required_keys | optional_keys) or
not keys.issuperset(required_keys)):
raise InvalidConfig("File #%d has invalid keys")
name = f['name']
path = PosixPath(f['path'])
readers = sorted(f.get('read_by_runs', []))
writers = sorted(f.get('written_by_runs', []))
if name in files:
if files[name].path != path:
old_name, name = name, uniquenames(name)
logging.warning("File name appears multiple times: %s\n"
"Using name %s instead",
old_name, name)
else:
uniquenames.insert(name)
if path in paths:
if paths[path] == name:
logging.warning("File appears multiple times: %s", name)
else:
logging.warning("Two files have the same path (but different "
"names): %s, %s\nUsing name %s",
name, paths[path], paths[path])
name = paths[path]
files[name].read_runs.update(readers)
files[name].write_runs.update(writers)
else:
paths[path] = name
files[name] = InputOutputFile(path, readers, writers)
return files
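# For reference, a sketch of the YAML fragment that load_iofiles() parses
# (the key names come from the code above; the paths are invented):
#
#     inputs_outputs:
#       - name: data
#         path: /home/user/data.csv
#         read_by_runs: [0]
#       - name: result
#         path: /home/user/out.txt
#         written_by_runs: [0]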
def load_config(filename, canonical, File=File, Package=Package):
"""Loads a YAML configuration file.
`File` and `Package` parameters can be used to override the classes that
will be used to hold files and distribution packages; useful during the
packing step.
`canonical` indicates whether a canonical configuration file is expected
(in which case the ``additional_patterns`` section is not accepted). Note
that this changes the number of returned values of this function.
"""
with filename.open(encoding='utf-8') as fp:
config = yaml.safe_load(fp)
keys_ = set(config)
if 'version' not in keys_:
raise InvalidConfig("Missing version")
# Parse the version only once we know the key exists
ver = LooseVersion(config['version'])
# Accepts versions from 0.2 to 0.8 inclusive
if not LooseVersion('0.2') <= ver < LooseVersion('0.9'):
pkgname = (__package__ or __name__).split('.', 1)[0]
raise InvalidConfig("Loading configuration file in unknown format %s; "
"this probably means that you should upgrade "
"%s" % (ver, pkgname))
unknown_keys = keys_ - set(['pack_id', 'version', 'runs',
'inputs_outputs',
'packages', 'other_files',
'additional_patterns',
# Deprecated
'input_files', 'output_files'])
if unknown_keys:
logging.warning("Unrecognized sections in configuration: %s",
', '.join(unknown_keys))
runs = config.get('runs') or []
packages = read_packages(config.get('packages'), File, Package)
other_files = read_files(config.get('other_files'), File)
inputs_outputs = load_iofiles(config, runs)
# reprozip < 0.7 compatibility: set inputs/outputs on runs (for plugins)
for i, run in enumerate(runs):
run['input_files'] = dict((n, f.path)
for n, f in iteritems(inputs_outputs)
if i in f.read_runs)
run['output_files'] = dict((n, f.path)
for n, f in iteritems(inputs_outputs)
if i in f.write_runs)
# reprozip < 0.8 compatibility: assign IDs to runs
for i, run in enumerate(runs):
if run.get('id') is None:
run['id'] = "run%d" % i
record_usage_package(runs, packages, other_files,
inputs_outputs,
pack_id=config.get('pack_id'))
kwargs = {'format_version': ver,
'inputs_outputs': inputs_outputs}
if canonical:
if 'additional_patterns' in config:
raise InvalidConfig("Canonical configuration file shouldn't have "
"additional_patterns key anymore")
else:
kwargs['additional_patterns'] = config.get('additional_patterns') or []
return Config(runs, packages, other_files,
**kwargs)
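# Sketch of the two calling conventions implied by `canonical` (the
# filename is hypothetical; the tuple/attribute behavior follows from the
# optional_return_type declaration above):
#
#     runs, packages, other_files = load_config(Path('config.yml'), True)
#     config = load_config(Path('config.yml'), canonical=False)
#     config.inputs_outputs   # optional values are exposed as attributes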
def write_file(fp, fi, indent=0):
fp.write("%s - \"%s\"%s\n" % (
" " * indent,
escape(unicode_(fi.path)),
' # %s' % fi.comment if fi.comment is not None else ''))
def write_package(fp, pkg, indent=0):
indent_str = " " * indent
fp.write("%s - name: \"%s\"\n" % (indent_str, escape(pkg.name)))
fp.write("%s version: \"%s\"\n" % (indent_str, escape(pkg.version)))
if pkg.size is not None:
fp.write("%s size: %d\n" % (indent_str, pkg.size))
fp.write("%s packfiles: %s\n" % (indent_str, 'true' if pkg.packfiles
else 'false'))
fp.write("%s files:\n"
"%s # Total files used: %s\n" % (
indent_str, indent_str,
hsize(sum(fi.size
for fi in pkg.files
if fi.size is not None))))
if pkg.size is not None:
fp.write("%s # Installed package size: %s\n" % (
indent_str, hsize(pkg.size)))
for fi in sorted(pkg.files, key=lambda fi_: fi_.path):
write_file(fp, fi, indent + 1)
def save_config(filename, runs, packages, other_files, reprozip_version,
inputs_outputs=None,
canonical=False, pack_id=None):
"""Saves the configuration to a YAML file.
`canonical` indicates whether this is a canonical configuration file
(no ``additional_patterns`` section).
"""
dump = lambda x: yaml.safe_dump(x, encoding='utf-8', allow_unicode=True)
with filename.open('w', encoding='utf-8', newline='\n') as fp:
# Writes preamble
fp.write("""\
# ReproZip configuration file
# This file was generated by reprozip {version} at {date}
{what}
# Run info{pack_id}
version: "{format!s}"
""".format(pack_id=(('\npack_id: "%s"' % pack_id) if pack_id is not None
else ''),
version=escape(reprozip_version),
format='0.8',
date=datetime.now().isoformat(),
what=("# It was generated by the packer and you shouldn't need to "
"edit it" if canonical
else "# You might want to edit this file before running the "
"packer\n# See 'reprozip pack -h' for help")))
fp.write("runs:\n")
for i, run in enumerate(runs):
# Remove reprozip < 0.7 compatibility fields
run = dict((k, v) for k, v in iteritems(run)
if k not in ('input_files', 'output_files'))
fp.write("# Run %d\n" % i)
fp.write(dump([run]).decode('utf-8'))
fp.write("\n")
fp.write("""\
# Input and output files
# Inputs are files that are only read by a run; reprounzip can replace these
# files on demand to run the experiment with custom data.
# Outputs are files that are generated by a run; reprounzip can extract these
# files from the experiment on demand, for the user to examine.
# The name field is the identifier the user will use to access these files.
inputs_outputs:""")
for n, f in iteritems(inputs_outputs):
fp.write("""\
- name: {name}
path: {path}
written_by_runs: {writers}
read_by_runs: {readers}""".format(name=n, path=unicode_(f.path),
readers=repr(f.read_runs),
writers=repr(f.write_runs)))
fp.write("""\
# Files to pack
# All the files below were used by the program; they will be included in the
# generated package
# These files come from packages; we can thus choose not to include them, as it
# will simply be possible to install that package on the destination system
# They are included anyway by default
packages:
""")
# Writes files
for pkg in sorted(packages, key=lambda p: p.name):
write_package(fp, pkg)
fp.write("""\
# These files do not appear to come with an installed package -- you probably
# want them packed
other_files:
""")
for f in sorted(other_files, key=lambda fi: fi.path):
write_file(fp, f)
if not canonical:
fp.write("""\
# If you want to include additional files in the pack, you can list additional
# patterns of files that will be included
additional_patterns:
# Example:
# - /etc/apache2/** # Everything under apache2/
# - /var/log/apache2/*.log # Log files directly under apache2/
# - /var/lib/lxc/*/rootfs/home/**/*.py # All Python files of all users in
# # that container
""")
class LoggingDateFormatter(logging.Formatter):
"""Formatter that puts milliseconds in the timestamp.
"""
converter = datetime.fromtimestamp
def formatTime(self, record, datefmt=None):
ct = self.converter(record.created)
t = ct.strftime("%H:%M:%S")
s = "%s.%03d" % (t, record.msecs)
return s
def setup_logging(tag, verbosity):
"""Sets up the logging module.
"""
levels = [logging.CRITICAL, logging.WARNING, logging.INFO, logging.DEBUG]
console_level = levels[min(verbosity, 3)]
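# min(verbosity, 3) indexes the list above: the command-line default of 1
# (see main.py) maps to WARNING, -v to INFO, -vv or more to DEBUG; the
# file log below always stays at INFO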
file_level = logging.INFO
min_level = min(console_level, file_level)
# Create formatter, with same format as C extension
fmt = "[%s] %%(asctime)s %%(levelname)s: %%(message)s" % tag
formatter = LoggingDateFormatter(fmt)
# Console logger
handler = logging.StreamHandler()
handler.setLevel(console_level)
handler.setFormatter(formatter)
# Set up logger
logger = logging.root
logger.setLevel(min_level)
logger.addHandler(handler)
# File logger
dotrpz = Path('~/.reprozip').expand_user()
try:
if not dotrpz.is_dir():
dotrpz.mkdir()
filehandler = logging.handlers.RotatingFileHandler(str(dotrpz / 'log'),
mode='a',
delay=False,
maxBytes=400000,
backupCount=5)
except (IOError, OSError):
logging.warning("Couldn't create log file %s", dotrpz / 'log')
else:
filehandler.setFormatter(formatter)
filehandler.setLevel(file_level)
logger.addHandler(filehandler)
filehandler.emit(logging.root.makeRecord(
__name__.split('.', 1)[0],
logging.INFO,
"(log start)", 0,
"Log opened %s %s",
(datetime.now().strftime("%Y-%m-%d"), sys.argv),
None))
_usage_report = None
def setup_usage_report(name, version):
"""Sets up the usagestats module.
"""
global _usage_report
certificate_file = get_reprozip_ca_certificate()
_usage_report = usagestats.Stats(
'~/.reprozip/usage_stats',
usagestats.Prompt(enable='%s usage_report --enable' % name,
disable='%s usage_report --disable' % name),
os.environ.get('REPROZIP_USAGE_URL',
'https://stats.reprozip.org/'),
version='%s %s' % (name, version),
unique_user_id=True,
env_var='REPROZIP_USAGE_STATS',
ssl_verify=certificate_file.path)
try:
os.getcwd().encode('ascii')
except (UnicodeEncodeError, UnicodeDecodeError):
record_usage(cwd_ascii=False)
else:
record_usage(cwd_ascii=True)
def enable_usage_report(enable):
"""Enables or disables usage reporting.
"""
if enable:
_usage_report.enable_reporting()
stderr.write("Thank you, usage reports will be sent automatically "
"from now on.\n")
else:
_usage_report.disable_reporting()
stderr.write("Usage reports will not be collected nor sent.\n")
def record_usage(**kwargs):
"""Records some info in the current usage report.
"""
if _usage_report is not None:
_usage_report.note(kwargs)
def record_usage_package(runs, packages, other_files,
inputs_outputs,
pack_id=None):
"""Records the info on some pack file into the current usage report.
"""
if _usage_report is None:
return
for run in runs:
record_usage(argv0=run['argv'][0])
record_usage(pack_id=pack_id or '',
nb_packages=len(packages),
nb_package_files=sum(len(pkg.files)
for pkg in packages),
packed_packages=sum(1 for pkg in packages
if pkg.packfiles),
nb_other_files=len(other_files),
nb_input_outputs_files=len(inputs_outputs),
nb_input_files=sum(1 for f in itervalues(inputs_outputs)
if f.read_runs),
nb_output_files=sum(1 for f in itervalues(inputs_outputs)
if f.write_runs))
def submit_usage_report(**kwargs):
"""Submits the current usage report to the usagestats server.
"""
_usage_report.submit(kwargs,
usagestats.OPERATING_SYSTEM,
usagestats.SESSION_TIME,
usagestats.PYTHON_VERSION)
def get_reprozip_ca_certificate():
"""Gets the ReproZip CA certificate filename.
"""
fd, certificate_file = Path.tempfile(prefix='rpz_stats_ca_', suffix='.pem')
with certificate_file.open('wb') as fp:
fp.write(usage_report_ca)
os.close(fd)
atexit.register(os.remove, certificate_file.path)
return certificate_file
usage_report_ca = b'''\
-----BEGIN CERTIFICATE-----
MIIDzzCCAregAwIBAgIJAMmlcDnTidBEMA0GCSqGSIb3DQEBCwUAMH4xCzAJBgNV
BAYTAlVTMREwDwYDVQQIDAhOZXcgWW9yazERMA8GA1UEBwwITmV3IFlvcmsxDDAK
BgNVBAoMA05ZVTERMA8GA1UEAwwIUmVwcm9aaXAxKDAmBgkqhkiG9w0BCQEWGXJl
cHJvemlwLWRldkB2Z2MucG9seS5lZHUwHhcNMTQxMTA3MDUxOTA5WhcNMjQxMTA0
MDUxOTA5WjB+MQswCQYDVQQGEwJVUzERMA8GA1UECAwITmV3IFlvcmsxETAPBgNV
BAcMCE5ldyBZb3JrMQwwCgYDVQQKDANOWVUxETAPBgNVBAMMCFJlcHJvWmlwMSgw
JgYJKoZIhvcNAQkBFhlyZXByb3ppcC1kZXZAdmdjLnBvbHkuZWR1MIIBIjANBgkq
hkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA1fuTW2snrVji51vGVl9hXAAZbNJ+dxG+
/LOOxZrF2f1RRNy8YWpeCfGbsZqiIEjorBv8lvdd9P+tD3M5sh9L0zQPU9dFvDb+
OOrV0jx59hbK3QcCQju3YFuAtD1lu8TBIPgGEab0eJhLVIX+XU5cYXrfoBmwCpN/
1wXWkUhN91ZVMA0ylATAxTpnoNuMKzfTxT8pyOWajiTskYkKmVBAxgYJQe1YDFA8
fglBNkQuHqP8jgYAniEBCAPZRMMq8WpOtyFx+L9LX9/WcHtAQyDPPb9M81KKgPQq
urtCqtuDKxuqcX9zg4/O8l4nZ50pwaJjbH4kMW/wnLzTPvzZCPtJYQIDAQABo1Aw
TjAdBgNVHQ4EFgQUJjhDDOup4P0cdrAVq1F9ap3yTj8wHwYDVR0jBBgwFoAUJjhD
DOup4P0cdrAVq1F9ap3yTj8wDAYDVR0TBAUwAwEB/zANBgkqhkiG9w0BAQsFAAOC
AQEAeKpTiy2WYPqevHseTCJDIL44zghDJ9w5JmECOhFgPXR9Hl5Nh9S1j4qHBs4G
cn8d1p2+8tgcJpNAysjuSl4/MM6hQNecW0QVqvJDQGPn33bruMB4DYRT5du1Zpz1
YIKRjGU7Of3CycOCbaT50VZHhEd5GS2Lvg41ngxtsE8JKnvPuim92dnCutD0beV+
4TEvoleIi/K4AZWIaekIyqazd0c7eQjgSclNGgePcdbaxIo0u6tmdTYk3RNzo99t
DCfXxuMMg3wo5pbqG+MvTdECaLwt14zWU259z8JX0BoeVG32kHlt2eUpm5PCfxqc
dYuwZmAXksp0T0cWo0DnjJKRGQ==
-----END CERTIFICATE-----
'''

reprounzip-1.0.10/reprounzip/main.py
# Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
"""Entry point for the reprounzip utility.
This contains :func:`~reprounzip.reprounzip.main`, which is the entry point
declared to setuptools. It is also callable directly.
It dispatches to plugins registered through pkg_resources as entry point
``reprounzip.unpackers``.
"""
from __future__ import division, print_function, unicode_literals
if __name__ == '__main__': # noqa
from reprounzip.main import main
main()
import argparse
import locale
import logging
from pkg_resources import iter_entry_points
import sys
import traceback
from reprounzip.common import setup_logging, \
setup_usage_report, enable_usage_report, \
submit_usage_report, record_usage
from reprounzip import signals
from reprounzip.unpackers.common import UsageError
__version__ = '1.0.10'
unpackers = {}
def get_plugins(entry_point_name):
for entry_point in iter_entry_points(entry_point_name):
try:
func = entry_point.load()
except Exception:
print("Plugin %s from %s %s failed to initialize!" % (
entry_point.name,
entry_point.dist.project_name, entry_point.dist.version),
file=sys.stderr)
traceback.print_exc(file=sys.stderr)
continue
name = entry_point.name
# Docstring is used as description (used for detailed help)
descr = func.__doc__.strip()
# First line of docstring is the help (used for general help)
descr_1 = descr.split('\n', 1)[0]
yield name, func, descr, descr_1
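# Sketch of how a plugin would register itself for get_plugins() (a
# hypothetical setup.py fragment; the entry-point group name comes from
# the calls in main() below):
#
#     entry_points={
#         'reprounzip.unpackers': [
#             'mybackend = myplugin:setup_mybackend',
#         ],
#     }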
class RPUZArgumentParser(argparse.ArgumentParser):
def error(self, message):
sys.stderr.write('error: %s\n' % message)
self.print_help(sys.stderr)
sys.exit(2)
def usage_report(args):
if bool(args.enable) == bool(args.disable):
logging.critical("What do you want to do?")
raise UsageError
enable_usage_report(args.enable)
sys.exit(0)
def main():
"""Entry point when called on the command-line.
"""
# Locale
locale.setlocale(locale.LC_ALL, '')
# Parses command-line
# General options
def add_options(opts):
opts.add_argument('--version', action='version',
version="reprounzip version %s" % __version__)
# Loads plugins
for name, func, descr, descr_1 in get_plugins('reprounzip.plugins'):
func()
parser = RPUZArgumentParser(
description="reprounzip is the ReproZip component responsible for "
"unpacking and reproducing an experiment previously "
"packed with reprozip",
epilog="Please report issues to reprozip-users@vgc.poly.edu")
add_options(parser)
parser.add_argument('-v', '--verbose', action='count', default=1,
dest='verbosity',
help="augments verbosity level")
subparsers = parser.add_subparsers(title="subcommands", metavar='')
# usage_report subcommand
parser_stats = subparsers.add_parser(
'usage_report',
help="Enables or disables anonymous usage reports")
add_options(parser_stats)
parser_stats.add_argument('--enable', action='store_true')
parser_stats.add_argument('--disable', action='store_true')
parser_stats.set_defaults(func=usage_report)
# Loads unpackers
for name, func, descr, descr_1 in get_plugins('reprounzip.unpackers'):
plugin_parser = subparsers.add_parser(
name, help=descr_1, description=descr,
formatter_class=argparse.RawDescriptionHelpFormatter)
add_options(plugin_parser)
info = func(plugin_parser)
plugin_parser.set_defaults(selected_unpacker=name)
if info is None:
info = {}
unpackers[name] = info
signals.pre_parse_args(parser=parser, subparsers=subparsers)
args = parser.parse_args()
signals.post_parse_args(args=args)
if getattr(args, 'func', None) is None:
parser.print_help(sys.stderr)
sys.exit(2)
signals.unpacker = getattr(args, 'selected_unpacker', None)
setup_logging('REPROUNZIP', args.verbosity)
setup_usage_report('reprounzip', __version__)
if hasattr(args, 'selected_unpacker'):
record_usage(unpacker=args.selected_unpacker)
signals.pre_setup.subscribe(lambda **kw: record_usage(setup=True))
signals.pre_run.subscribe(lambda **kw: record_usage(run=True))
try:
try:
args.func(args)
except UsageError:
raise
except Exception as e:
signals.application_finishing(reason=e)
submit_usage_report(result=type(e).__name__)
raise
else:
signals.application_finishing(reason=None)
except UsageError:
parser.print_help(sys.stderr)
sys.exit(2)
submit_usage_report(result='success')
sys.exit(0)

reprounzip-1.0.10/reprounzip/orderedset.py
# From http://code.activestate.com/recipes/576694/
# With added update()
# Copyright (C) 2009 Raymond Hettinger
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
import collections
class OrderedSet(collections.MutableSet):
def __init__(self, iterable=None):
self.end = end = []
end += [None, end, end] # sentinel node for doubly linked list
self.map = {} # key --> [key, prev, next_]
if iterable is not None:
self |= iterable
def __len__(self):
return len(self.map)
def __contains__(self, key):
return key in self.map
def add(self, key):
if key not in self.map:
end = self.end
curr = end[1]
curr[2] = end[1] = self.map[key] = [key, curr, end]
def discard(self, key):
if key in self.map:
key, prev, next_ = self.map.pop(key)
prev[2] = next_
next_[1] = prev
def __iter__(self):
end = self.end
curr = end[2]
while curr is not end:
yield curr[0]
curr = curr[2]
def __reversed__(self):
end = self.end
curr = end[1]
while curr is not end:
yield curr[0]
curr = curr[1]
def pop(self, last=True):
if not self:
raise KeyError('set is empty')
key = self.end[1][0] if last else self.end[2][0]
self.discard(key)
return key
def __repr__(self):
if not self:
return '%s()' % (self.__class__.__name__,)
return '%s(%r)' % (self.__class__.__name__, list(self))
def __eq__(self, other):
if isinstance(other, OrderedSet):
return len(self) == len(other) and list(self) == list(other)
return set(self) == set(other)
def update(self, iterable):
for key in iterable:
self.add(key)
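# Illustrative example of the ordering guarantee:
#
#     s = OrderedSet('abracadabra')   # OrderedSet(['a', 'b', 'r', 'c', 'd'])
#     s.update('xz')
#     s.pop(last=False)               # -> 'a'; s.pop() would return 'z'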

reprounzip-1.0.10/reprounzip/pack_info.py
# Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
"""Entry point for the reprounzip utility.
This contains :func:`~reprounzip.reprounzip.main`, which is the entry point
declared to setuptools. It is also callable directly.
It dispatches to plugins registered through pkg_resources as entry point
``reprounzip.unpackers``.
"""
from __future__ import division, print_function, unicode_literals
import argparse
import json
import logging
import platform
from rpaths import Path
import sys
from reprounzip.common import RPZPack, load_config as load_config_file
from reprounzip.main import unpackers
from reprounzip.unpackers.common import load_config, COMPAT_OK, COMPAT_MAYBE, \
COMPAT_NO, UsageError, shell_escape, metadata_read
from reprounzip.utils import iteritems, itervalues, unicode_, hsize
def get_package_info(pack, read_data=False):
"""Get information about a package.
"""
runs, packages, other_files = config = load_config(pack)
inputs_outputs = config.inputs_outputs
information = {}
if read_data:
total_size = 0
total_paths = 0
files = 0
dirs = 0
symlinks = 0
hardlinks = 0
others = 0
rpz_pack = RPZPack(pack)
for m in rpz_pack.list_data():
total_size += m.size
total_paths += 1
if m.isfile():
files += 1
elif m.isdir():
dirs += 1
elif m.issym():
symlinks += 1
elif hasattr(m, 'islnk') and m.islnk():
hardlinks += 1
else:
others += 1
rpz_pack.close()
information['pack'] = {
'total_size': total_size,
'total_paths': total_paths,
'files': files,
'dirs': dirs,
'symlinks': symlinks,
'hardlinks': hardlinks,
'others': others,
}
total_paths = 0
packed_packages_files = 0
unpacked_packages_files = 0
packed_packages = 0
for package in packages:
nb = len(package.files)
total_paths += nb
if package.packfiles:
packed_packages_files += nb
packed_packages += 1
else:
unpacked_packages_files += nb
nb = len(other_files)
total_paths += nb
information['meta'] = {
'total_paths': total_paths,
'packed_packages_files': packed_packages_files,
'unpacked_packages_files': unpacked_packages_files,
'packages': len(packages),
'packed_packages': packed_packages,
'packed_paths': packed_packages_files + nb,
}
if runs:
architecture = runs[0]['architecture']
if any(r['architecture'] != architecture
for r in runs):
logging.warning("Runs have different architectures")
information['meta']['architecture'] = architecture
distribution = runs[0]['distribution']
if any(r['distribution'] != distribution
for r in runs):
logging.warning("Runs have different distributions")
information['meta']['distribution'] = distribution
information['runs'] = [
dict((k, run[k])
for k in ['id', 'binary', 'argv', 'environ',
'workingdir', 'signal', 'exitcode']
if k in run)
for run in runs]
information['inputs_outputs'] = {
name: {'path': str(iofile.path),
'read_runs': iofile.read_runs,
'write_runs': iofile.write_runs}
for name, iofile in iteritems(inputs_outputs)}
# Unpacker compatibility
unpacker_status = {}
for name, upk in iteritems(unpackers):
if 'test_compatibility' in upk:
compat = upk['test_compatibility']
if callable(compat):
compat = compat(pack, config=config)
if isinstance(compat, (tuple, list)):
compat, msg = compat
else:
msg = None
unpacker_status.setdefault(compat, []).append((name, msg))
else:
unpacker_status.setdefault(None, []).append((name, None))
information['unpacker_status'] = unpacker_status
return information
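# Shape of the dict built above (a sketch with illustrative values; the
# keys come from the assignments in this function, and print_info() below
# dumps it as JSON when --json is passed):
#
#     {"pack": {"total_size": ..., "total_paths": ..., "files": ..., ...},
#      "meta": {"total_paths": ..., "packages": ..., ...},
#      "runs": [{"id": "run0", "argv": [...], ...}],
#      "inputs_outputs": {"data": {"path": ..., "read_runs": [0], ...}},
#      "unpacker_status": {...}}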
def _print_package_info(pack, info, verbosity=1):
print("Pack file: %s" % pack)
print("\n----- Pack information -----")
print("Compressed size: %s" % hsize(pack.size()))
info_pack = info.get('pack')
if info_pack:
if 'total_size' in info_pack:
print("Unpacked size: %s" % hsize(info_pack['total_size']))
if 'total_paths' in info_pack:
print("Total packed paths: %d" % info_pack['total_paths'])
if verbosity >= 3:
print(" Files: %d" % info_pack['files'])
print(" Directories: %d" % info_pack['dirs'])
if info_pack.get('symlinks'):
print(" Symbolic links: %d" % info_pack['symlinks'])
if info_pack.get('hardlinks'):
print(" Hard links: %d" % info_pack['hardlinks'])
if info_pack.get('others'):
print(" Unknown (what!?): %d" % info_pack['others'])
print("\n----- Metadata -----")
info_meta = info['meta']
if verbosity >= 3:
print("Total paths: %d" % info_meta['total_paths'])
print("Listed packed paths: %d" % info_meta['packed_paths'])
if info_meta.get('packages'):
print("Total software packages: %d" % info_meta['packages'])
print("Packed software packages: %d" % info_meta['packed_packages'])
if verbosity >= 3:
print("Files from packed software packages: %d" %
info_meta['packed_packages_files'])
print("Files from unpacked software packages: %d" %
info_meta['unpacked_packages_files'])
if 'architecture' in info_meta:
print("Architecture: %s (current: %s)" % (info_meta['architecture'],
platform.machine().lower()))
if 'distribution' in info_meta:
distribution = ' '.join(t for t in info_meta['distribution'] if t)
current_distribution = platform.linux_distribution()[0:2]
current_distribution = ' '.join(t for t in current_distribution if t)
print("Distribution: %s (current: %s)" % (
distribution, current_distribution or "(not Linux)"))
if 'runs' in info:
runs = info['runs']
print("Runs (%d):" % len(runs))
for run in runs:
cmdline = ' '.join(shell_escape(a) for a in run['argv'])
if len(runs) == 1 and run['id'] == "run0":
print(" %s" % cmdline)
else:
print(" %s: %s" % (run['id'], cmdline))
if verbosity >= 2:
print(" wd: %s" % run['workingdir'])
if 'signal' in run:
print(" signal: %d" % run['signal'])
else:
print(" exitcode: %d" % run['exitcode'])
inputs_outputs = info.get('inputs_outputs')
if inputs_outputs:
if verbosity < 2:
print("Inputs/outputs files (%d): %s" % (
len(inputs_outputs), ", ".join(sorted(inputs_outputs))))
else:
print("Inputs/outputs files (%d):" % len(inputs_outputs))
for name, f in sorted(iteritems(inputs_outputs)):
t = []
if f['read_runs']:
t.append("in")
if f['write_runs']:
t.append("out")
print(" %s (%s): %s" % (name, ' '.join(t), f['path']))
unpacker_status = info.get('unpacker_status')
if unpacker_status:
print("\n----- Unpackers -----")
for s, n in [(COMPAT_OK, "Compatible"), (COMPAT_MAYBE, "Unknown"),
(COMPAT_NO, "Incompatible")]:
if s != COMPAT_OK and verbosity < 2:
continue
if s not in unpacker_status:
continue
upks = unpacker_status[s]
print("%s (%d):" % (n, len(upks)))
for upk_name, msg in upks:
if msg is not None:
print(" %s (%s)" % (upk_name, msg))
else:
print(" %s" % upk_name)
def print_info(args):
"""Writes out some information about a pack file.
"""
pack = Path(args.pack[0])
info = get_package_info(pack, read_data=args.json or args.verbosity >= 2)
if args.json:
json.dump(info, sys.stdout, indent=2)
sys.stdout.write('\n')
else:
_print_package_info(pack, info, args.verbosity)
def showfiles(args):
"""Writes out the input and output files.
Works both for a pack file and for an extracted directory.
"""
def parse_run(runs, s):
for i, run in enumerate(runs):
if run['id'] == s:
return i
try:
r = int(s)
except ValueError:
logging.critical("Error: Unknown run %s", s)
raise UsageError
if r < 0 or r >= len(runs):
logging.critical("Error: Expected 0 <= run <= %d, got %d",
len(runs) - 1, r)
sys.exit(1)
return r
show_inputs = args.input or not args.output
show_outputs = args.output or not args.input
def file_filter(fio):
if file_filter.run is None:
return ((show_inputs and fio.read_runs) or
(show_outputs and fio.write_runs))
else:
return ((show_inputs and file_filter.run in fio.read_runs) or
(show_outputs and file_filter.run in fio.write_runs))
file_filter.run = None
pack = Path(args.pack[0])
if not pack.exists():
logging.critical("Pack or directory %s does not exist", pack)
sys.exit(1)
if pack.is_dir():
# Reads info from an unpacked directory
config = load_config_file(pack / 'config.yml',
canonical=True)
# Filter files by run
if args.run is not None:
file_filter.run = parse_run(config.runs, args.run)
# The '.reprounzip' file is a pickled dictionary; it contains the names
# of the files that replaced each input file (if upload was used)
unpacked_info = metadata_read(pack, None)
assigned_input_files = unpacked_info.get('input_files', {})
if show_inputs:
shown = False
for input_name, f in sorted(iteritems(config.inputs_outputs)):
if f.read_runs and file_filter(f):
if not shown:
print("Input files:")
shown = True
if args.verbosity >= 2:
print(" %s (%s)" % (input_name, f.path))
else:
print(" %s" % input_name)
assigned = assigned_input_files.get(input_name)
if assigned is None:
assigned = "(original)"
elif assigned is False:
assigned = "(not created)"
elif assigned is True:
assigned = "(generated)"
else:
assert isinstance(assigned, (bytes, unicode_))
print(" %s" % assigned)
if not shown:
print("Input files: none")
if show_outputs:
shown = False
for output_name, f in sorted(iteritems(config.inputs_outputs)):
if f.write_runs and file_filter(f):
if not shown:
print("Output files:")
shown = True
if args.verbosity >= 2:
print(" %s (%s)" % (output_name, f.path))
else:
print(" %s" % output_name)
if not shown:
print("Output files: none")
else: # pack.is_file()
# Reads info from a pack file
config = load_config(pack)
# Filter files by run
if args.run is not None:
file_filter.run = parse_run(config.runs, args.run)
if any(f.read_runs for f in itervalues(config.inputs_outputs)):
print("Input files:")
for input_name, f in sorted(iteritems(config.inputs_outputs)):
if f.read_runs and file_filter(f):
if args.verbosity >= 2:
print(" %s (%s)" % (input_name, f.path))
else:
print(" %s" % input_name)
else:
print("Input files: none")
if any(f.write_runs for f in itervalues(config.inputs_outputs)):
print("Output files:")
for output_name, f in sorted(iteritems(config.inputs_outputs)):
if f.write_runs and file_filter(f):
if args.verbosity >= 2:
print(" %s (%s)" % (output_name, f.path))
else:
print(" %s" % output_name)
else:
print("Output files: none")
def setup_info(parser, **kwargs):
"""Prints out some information about a pack
"""
parser.add_argument('pack', nargs=1,
help="Pack to read")
parser.add_argument('--json', action='store_true', default=False)
parser.set_defaults(func=print_info)
def setup_showfiles(parser, **kwargs):
"""Prints out input and output file names
"""
parser.add_argument('pack', nargs=1,
help="Pack or directory to read from")
parser.add_argument('run', nargs=argparse.OPTIONAL,
help="Run whose input and output files will be listed")
parser.add_argument('--input', action='store_true',
help="Only show input files")
parser.add_argument('--output', action='store_true',
help="Only show output files")
parser.set_defaults(func=showfiles)

reprounzip-1.0.10/reprounzip/parameters.py
# Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
"""Retrieve parameters from online source.
Most unpackers require some parameters that are likely to change on a different
schedule from ReproZip's releases. To account for that, ReproZip downloads a
"parameter file", which is just a JSON with a bunch of parameters.
In there you will find things like the address of some binaries that are
downloaded from the web (rpzsudo and busybox), and the name of Vagrant boxes
and Docker images for various operating systems.
"""
from __future__ import division, print_function, unicode_literals
from distutils.version import LooseVersion
import json
import logging
import os
from reprounzip.common import get_reprozip_ca_certificate
from reprounzip.utils import download_file
parameters = None
def update_parameters():
"""Try to download a new version of the parameter file.
"""
global parameters
if parameters is not None:
return
url = 'https://stats.reprozip.org/parameters/'
env_var = os.environ.get('REPROZIP_PARAMETERS')
if env_var and (
env_var.startswith('http://') or env_var.startswith('https://')):
# This is only used for testing
# Note that this still expects the ReproZip CA
url = env_var
elif env_var not in (None, '', '1', 'on', 'enabled', 'yes', 'true'):
parameters = json.loads(bundled_parameters)
return
try:
from reprounzip.main import __version__ as version
filename = download_file(
'%s%s' % (url, version),
None,
cachename='parameters.json',
ssl_verify=get_reprozip_ca_certificate().path)
except Exception:
logging.info("Can't download parameters.json, using bundled "
"parameters")
else:
try:
with filename.open() as fp:
parameters = json.load(fp)
except ValueError:
logging.info("Downloaded parameters.json doesn't load, using "
"bundled parameters")
try:
filename.remove()
except OSError:
pass
else:
ver = LooseVersion(parameters.get('version', '1.0'))
if LooseVersion('1.0') <= ver < LooseVersion('1.1'):
return
else:
logging.info("parameters.json has incompatible version %s, "
"using bundled parameters", ver)
parameters = json.loads(bundled_parameters)
def get_parameter(section):
"""Get a parameter from the downloaded or default parameter file.
"""
if parameters is None:
update_parameters()
return parameters.get(section, None)
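# Example: how an unpacker might read a parameter (the section and keys
# exist in the bundled parameters below; the variable names are invented):
#
#     urls = get_parameter('busybox_url')
#     if urls is not None:
#         url = urls['x86_64']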
bundled_parameters = (
'{\n'
' "busybox_url": {\n'
' "x86_64": "https://s3.amazonaws.com/reprozip-files/busybox-x86_64",\n'
' "i686": "https://s3.amazonaws.com/reprozip-files/busybox-i686"\n'
' },\n'
' "rpzsudo_url": {\n'
' "x86_64": "https://github.com/remram44/static-sudo/releases/download/'
'current/rpzsudo-x86_64",\n'
' "i686": "https://github.com/remram44/static-sudo/releases/download/cu'
'rrent/rpzsudo-i686"\n'
' },\n'
' "docker_images": {\n'
' "default": "debian",\n'
' "images": {\n'
' "ubuntu": {\n'
' "versions": [\n'
' {\n'
' "version": "^12\\\\.04$",\n'
' "distribution": "ubuntu",\n'
' "image": "ubuntu:12.04",\n'
' "name": "Ubuntu 12.04 \'Precise\'"\n'
' },\n'
' {\n'
' "version": "^14\\\\.04$",\n'
' "distribution": "ubuntu",\n'
' "image": "ubuntu:14.04",\n'
' "name": "Ubuntu 14.04 \'Trusty\'"\n'
' },\n'
' {\n'
' "version": "^14\\\\.10$",\n'
' "distribution": "ubuntu",\n'
' "image": "ubuntu:14.10",\n'
' "name": "Ubuntu 14.10 \'Utopic\'"\n'
' },\n'
' {\n'
' "version": "^15\\\\.04$",\n'
' "distribution": "ubuntu",\n'
' "image": "ubuntu:15.04",\n'
' "name": "Ubuntu 15.04 \'Vivid\'"\n'
' },\n'
' {\n'
' "version": "^15\\\\.10$",\n'
' "distribution": "ubuntu",\n'
' "image": "ubuntu:15.10",\n'
' "name": "Ubuntu 15.10 \'Wily\'"\n'
' },\n'
' {\n'
' "version": "^16\\\\.04$",\n'
' "distribution": "ubuntu",\n'
' "image": "ubuntu:16.04",\n'
' "name": "Ubuntu 16.04 \'Xenial\'"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "ubuntu",\n'
' "image": "ubuntu:15.10",\n'
' "name": "Ubuntu 15.10 \'Wily\'"\n'
' }\n'
' },\n'
' "debian": {\n'
' "versions": [\n'
' {\n'
' "version": "^(6(\\\\.|$))|(squeeze)",\n'
' "distribution": "debian",\n'
' "image": "debian:squeeze",\n'
' "name": "Debian 6 \'Squeeze\'"\n'
' },\n'
' {\n'
' "version": "^(7(\\\\.|$))|(wheezy)",\n'
' "distribution": "debian",\n'
' "image": "debian:wheezy",\n'
' "name": "Debian 7 \'Wheezy\'"\n'
' },\n'
' {\n'
' "version": "^(8(\\\\.|$))|(jessie)",\n'
' "distribution": "debian",\n'
' "image": "debian:jessie",\n'
' "name": "Debian 8 \'Jessie\'"\n'
' },\n'
' {\n'
' "version": "^(9(\\\\.|$))|(stretch)",\n'
' "distribution": "debian",\n'
' "image": "debian:stretch",\n'
' "name": "Debian 9 \'Stretch\'"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "debian",\n'
' "image": "debian:jessie",\n'
' "name": "Debian 8 \'Jessie\'"\n'
' }\n'
' },\n'
' "centos": {\n'
' "versions": [\n'
' {\n'
' "version": "^5(\\\\.|$)",\n'
' "distribution": "centos",\n'
' "image": "centos:centos5",\n'
' "name": "CentOS 5"\n'
' },\n'
' {\n'
' "version": "^6(\\\\.|$)",\n'
' "distribution": "centos",\n'
' "image": "centos:centos6",\n'
' "name": "CentOS 6"\n'
' },\n'
' {\n'
' "version": "^7(\\\\.|$)",\n'
' "distribution": "centos",\n'
' "image": "centos:centos7",\n'
' "name": "CentOS 7"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "centos",\n'
' "image": "centos:centos7",\n'
' "name": "CentOS 7"\n'
' }\n'
' },\n'
' "centos linux": {\n'
' "versions": [\n'
' {\n'
' "version": "^5(\\\\.|$)",\n'
' "distribution": "centos",\n'
' "image": "centos:centos5",\n'
' "name": "CentOS 5"\n'
' },\n'
' {\n'
' "version": "^6(\\\\.|$)",\n'
' "distribution": "centos",\n'
' "image": "centos:centos6",\n'
' "name": "CentOS 6"\n'
' },\n'
' {\n'
' "version": "^7(\\\\.|$)",\n'
' "distribution": "centos",\n'
' "image": "centos:centos7",\n'
' "name": "CentOS 7"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "centos",\n'
' "image": "centos:centos7",\n'
' "name": "CentOS 7"\n'
' }\n'
' },\n'
' "fedora": {\n'
' "versions": [\n'
' {\n'
' "version": "^20$",\n'
' "distribution": "fedora",\n'
' "image": "fedora:20",\n'
' "name": "Fedora 20"\n'
' },\n'
# Fedora 21-24 don't have tar
' {\n'
' "version": "^25$",\n'
' "distribution": "fedora",\n'
' "image": "fedora:25",\n'
' "name": "Fedora 25"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "fedora",\n'
' "image": "fedora:25",\n'
' "name": "Fedora 25"\n'
' }\n'
' }\n'
' }\n'
' },\n'
' "vagrant_boxes": {\n'
' "default": "debian",\n'
' "boxes": {\n'
' "ubuntu": {\n'
' "versions": [\n'
' {\n'
' "version": "^12\\\\.04$",\n'
' "distribution": "ubuntu",\n'
' "architectures": {\n'
' "i686": "hashicorp/precise32",\n'
' "x86_64": "hashicorp/precise64"\n'
' },\n'
' "name": "Ubuntu 12.04 \'Precise\'"\n'
' },\n'
' {\n'
' "version": "^14\\\\.04$",\n'
' "distribution": "ubuntu",\n'
' "architectures": {\n'
' "i686": "ubuntu/trusty32",\n'
' "x86_64": "ubuntu/trusty64"\n'
' },\n'
' "name": "Ubuntu 14.04 \'Trusty\'"\n'
' },\n'
' {\n'
' "version": "^15\\\\.04$",\n'
' "distribution": "ubuntu",\n'
' "architectures": {\n'
' "i686": "ubuntu/vivid32",\n'
' "x86_64": "ubuntu/vivid64"\n'
' },\n'
' "name": "Ubuntu 15.04 \'Vivid\'"\n'
' },\n'
' {\n'
' "version": "^15\\\\.10$",\n'
' "distribution": "ubuntu",\n'
' "architectures": {\n'
' "i686": "ubuntu/wily32",\n'
' "x86_64": "ubuntu/wily64"\n'
' },\n'
' "name": "Ubuntu 15.10 \'Wily\'"\n'
' },\n'
' {\n'
' "version": "^16\\\\.04$",\n'
' "distribution": "ubuntu",\n'
' "architectures": {\n'
' "i686": "bento/ubuntu-16.04-i386",\n'
' "x86_64": "bento/ubuntu-16.04"\n'
' },\n'
' "name": "Ubuntu 16.04 \'Xenial\'"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "ubuntu",\n'
' "architectures": {\n'
' "i686": "bento/ubuntu-16.04-i386",\n'
' "x86_64": "bento/ubuntu-16.04"\n'
' },\n'
' "name": "Ubuntu 16.04 \'Xenial\'"\n'
' }\n'
' },\n'
' "debian": {\n'
' "versions": [\n'
' {\n'
' "version": "^(7(\\\\.|$))|(wheezy)",\n'
' "distribution": "debian",\n'
' "architectures": {\n'
' "i686": "remram/debian-7-i386",\n'
' "x86_64": "remram/debian-7-amd64"\n'
' },\n'
' "name": "Debian 7 \'Wheezy\'"\n'
' },\n'
' {\n'
' "version": "^(8(\\\\.|$))|(jessie)",\n'
' "distribution": "debian",\n'
' "architectures": {\n'
' "i686": "remram/debian-8-i386",\n'
' "x86_64": "remram/debian-8-amd64"\n'
' },\n'
' "name": "Debian 8 \'Jessie\'"\n'
' },\n'
' {\n'
' "version": "^(9(\\\\.|$))|(stretch)",\n'
' "distribution": "debian",\n'
' "architectures": {\n'
' "i686": "remram/debian-9-i386",\n'
' "x86_64": "remram/debian-9-amd64"\n'
' },\n'
' "name": "Debian 9 \'Stretch\'"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "debian",\n'
' "architectures": {\n'
' "i686": "remram/debian-8-i386",\n'
' "x86_64": "remram/debian-8-amd64"\n'
' },\n'
' "name": "Debian 8 \'Jessie\'"\n'
' }\n'
' },\n'
' "centos": {\n'
' "versions": [\n'
' {\n'
' "version": "^5\\\\.",\n'
' "distribution": "centos",\n'
' "architectures": {\n'
' "i686": "bento/centos-5.11-i386",\n'
' "x86_64": "bento/centos-5.11"\n'
' },\n'
' "name": "CentOS 5.11"\n'
' },\n'
' {\n'
' "version": "^6\\\\.",\n'
' "distribution": "centos",\n'
' "architectures": {\n'
' "i686": "bento/centos-6.7-i386",\n'
' "x86_64": "bento/centos-6.7"\n'
' },\n'
' "name": "CentOS 6.7"\n'
' },\n'
' {\n'
' "version": "^7\\\\.",\n'
' "distribution": "centos",\n'
' "architectures": {\n'
' "x86_64": "bento/centos-7.2"\n'
' },\n'
' "name": "CentOS 7.2"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "centos",\n'
' "architectures": {\n'
' "i686": "bento/centos-6.7-i386",\n'
' "x86_64": "bento/centos-6.7"\n'
' },\n'
' "name": "CentOS 6.7"\n'
' }\n'
' },\n'
' "centos linux": {\n'
' "versions": [\n'
' {\n'
' "version": "^5\\\\.",\n'
' "distribution": "centos",\n'
' "architectures": {\n'
' "i686": "bento/centos-5.11-i386",\n'
' "x86_64": "bento/centos-5.11"\n'
' },\n'
' "name": "CentOS 5.11"\n'
' },\n'
' {\n'
' "version": "^6\\\\.",\n'
' "distribution": "centos",\n'
' "architectures": {\n'
' "i686": "bento/centos-6.7-i386",\n'
' "x86_64": "bento/centos-6.7"\n'
' },\n'
' "name": "CentOS 6.7"\n'
' },\n'
' {\n'
' "version": "^7\\\\.",\n'
' "distribution": "centos",\n'
' "architectures": {\n'
' "x86_64": "bento/centos-7.2"\n'
' },\n'
' "name": "CentOS 7.2"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "centos",\n'
' "architectures": {\n'
' "i686": "bento/centos-6.7-i386",\n'
' "x86_64": "bento/centos-6.7"\n'
' },\n'
' "name": "CentOS 6.7"\n'
' }\n'
' },\n'
' "fedora": {\n'
' "versions": [\n'
' {\n'
' "version": "^22$",\n'
' "distribution": "fedora",\n'
' "architectures": {\n'
' "i686": "remram/fedora-22-i386",\n'
' "x86_64": "remram/fedora-22-amd64"\n'
' },\n'
' "name": "Fedora 22"\n'
' },\n'
' {\n'
' "version": "^23$",\n'
' "distribution": "fedora",\n'
' "architectures": {\n'
' "i686": "remram/fedora-23-i386",\n'
' "x86_64": "remram/fedora-23-amd64"\n'
' },\n'
' "name": "Fedora 23"\n'
' },\n'
' {\n'
' "version": "^24$",\n'
' "distribution": "fedora",\n'
' "architectures": {\n'
' "i686": "remram/fedora-24-i386",\n'
' "x86_64": "remram/fedora-24-amd64"\n'
' },\n'
' "name": "Fedora 24"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "fedora",\n'
' "architectures": {\n'
' "i686": "remram/fedora-24-i386",\n'
' "x86_64": "remram/fedora-24-amd64"\n'
' },\n'
' "name": "Fedora 24"\n'
' }\n'
' }\n'
' }\n'
' },\n'
' "vagrant_boxes_x": {\n'
' "default": "debian",\n'
' "boxes": {\n'
' "ubuntu": {\n'
' "versions": [\n'
' {\n'
' "version": "^16\\\\.04$",\n'
' "distribution": "ubuntu",\n'
' "architectures": {\n'
' "i686": "remram/ubuntu-1604-amd64-x",\n'
' "x86_64": "remram/ubuntu-1604-amd64-x"\n'
' },\n'
' "name": "Ubuntu 16.04 \'Xenial\'"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "ubuntu",\n'
' "architectures": {\n'
' "i686": "remram/ubuntu-1604-amd64-x",\n'
' "x86_64": "remram/ubuntu-1604-amd64-x"\n'
' },\n'
' "name": "Ubuntu 16.04 \'Xenial\'"\n'
' }\n'
' },\n'
' "debian": {\n'
' "versions": [\n'
' {\n'
' "version": "^(8(\\\\.|$))|(jessie)",\n'
' "distribution": "debian",\n'
' "architectures": {\n'
' "i686": "remram/debian-8-amd64-x",\n'
' "x86_64": "remram/debian-8-amd64-x"\n'
' },\n'
' "name": "Debian 8 \'Jessie\'"\n'
' }\n'
' ],\n'
' "default": {\n'
' "distribution": "debian",\n'
' "architectures": {\n'
' "i686": "remram/debian-8-amd64-x",\n'
' "x86_64": "remram/debian-8-amd64-x"\n'
' },\n'
' "name": "Debian 8 \'Jessie\'"\n'
' }\n'
' }\n'
' }\n'
' }\n'
'}\n'
)

reprounzip-1.0.10/reprounzip/plugins/__init__.py
try:  # pragma: no cover
__import__('pkg_resources').declare_namespace(__name__)
except ImportError: # pragma: no cover
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
reprounzip-1.0.10/reprounzip/signals.py 0000644 0000765 0000024 00000010704 13073250224 020643 0 ustar remram staff 0000000 0000000 # Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
"""Signal system.
Emitting and subscribing to these signals is the framework for the plugin
infrastructure.
"""
from __future__ import division, print_function, unicode_literals
import traceback
import warnings
from reprounzip.utils import irange, iteritems
class SignalWarning(UserWarning):
"""Warning from the Signal class.
    Mainly useful for testing (to turn these warnings into errors); however, a
    'signal:' prefix is actually used in the messages because of Python bug 22543
http://bugs.python.org/issue22543
"""
class Signal(object):
"""A signal, with its set of arguments.
This holds the expected parameters that the signal expects, in several
categories:
    * `expected_args` are the arguments of the signal that must be set. Trying
to emit the signal without these will show a warning and won't touch the
listeners. Listeners can rely on these being set.
* `new_args` are new arguments that listeners cannot yet rely on but that
      emitters should try to pass in. Missing arguments don't show a warning
yet but might in the future.
* `old_args` are arguments that you might still pass in but that you should
move away from; they will show a warning stating their deprecation.
Listeners can subscribe to a signal, and may be any callable hashable
object.
"""
REQUIRED, OPTIONAL, DEPRECATED = irange(3)
def __init__(self, expected_args=[], new_args=[], old_args=[]):
self._args = {}
self._args.update((arg, Signal.REQUIRED) for arg in expected_args)
self._args.update((arg, Signal.OPTIONAL) for arg in new_args)
self._args.update((arg, Signal.DEPRECATED) for arg in old_args)
if (len(expected_args) + len(new_args) + len(old_args) !=
len(self._args)):
raise ValueError("Repeated argument names")
self._listeners = set()
def __call__(self, **kwargs):
info = {}
for arg, argtype in iteritems(self._args):
if argtype == Signal.REQUIRED:
try:
info[arg] = kwargs.pop(arg)
except KeyError:
warnings.warn("signal: Missing required argument %s; "
"signal ignored" % arg,
category=SignalWarning,
stacklevel=2)
return
else:
if arg in kwargs:
info[arg] = kwargs.pop(arg)
if argtype == Signal.DEPRECATED:
warnings.warn(
"signal: Argument %s is deprecated" % arg,
category=SignalWarning,
stacklevel=2)
if kwargs:
arg = next(iter(kwargs))
warnings.warn(
"signal: Unexpected argument %s; signal ignored" % arg,
category=SignalWarning,
stacklevel=2)
return
for listener in self._listeners:
try:
listener(**info)
except Exception:
traceback.print_exc()
warnings.warn("signal: Got an exception calling a signal",
category=SignalWarning)
def subscribe(self, func):
"""Adds the given callable to the listeners.
It must be callable and hashable (it will be put in a set).
It will be called with the signals' arguments as keywords. Because new
parameters might be introduced, it should accept these by using::
def my_listener(param1, param2, **kwargs_):
"""
if not callable(func):
raise TypeError("%r object is not callable" % type(func))
self._listeners.add(func)
def unsubscribe(self, func):
"""Removes the given callable from the listeners.
If the listener wasn't subscribed, does nothing.
"""
self._listeners.discard(func)
pre_setup = Signal(['target', 'pack'])
post_setup = Signal(['target'], ['pack'])
pre_destroy = Signal(['target'])
post_destroy = Signal(['target'])
pre_run = Signal(['target'])
post_run = Signal(['target', 'retcode'])
pre_parse_args = Signal(['parser', 'subparsers'])
post_parse_args = Signal(['args'])
application_finishing = Signal(['reason'])
unpacker = None
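# Usage sketch (the listener below is hypothetical, not part of this module):
# listeners must be callable and hashable, and should accept **kwargs_ so that
# arguments added to a signal later don't break them.
#
#     from reprounzip import signals
#
#     def my_listener(target, retcode, **kwargs_):
#         print("command in %s exited with status %d" % (target, retcode))
#
#     signals.post_run.subscribe(my_listener)
#     # Emitting with a missing required argument only warns; listeners are
#     # not called in that case ('some_target' is a placeholder here)
#     signals.post_run(target=some_target, retcode=0)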
reprounzip-1.0.10/reprounzip/unpackers/ 0000755 0000765 0000024 00000000000 13130663165 020630 5 ustar remram staff 0000000 0000000 reprounzip-1.0.10/reprounzip/unpackers/__init__.py 0000644 0000765 0000024 00000000320 13033760435 022734 0 ustar remram staff 0000000 0000000 try: # pragma: no cover
__import__('pkg_resources').declare_namespace(__name__)
except ImportError: # pragma: no cover
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
reprounzip-1.0.10/reprounzip/unpackers/common/ 0000755 0000765 0000024 00000000000 13130663165 022120 5 ustar remram staff 0000000 0000000 reprounzip-1.0.10/reprounzip/unpackers/common/__init__.py 0000644 0000765 0000024 00000003123 13127776450 024240 0 ustar remram staff 0000000 0000000 # Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
"""Utility functions for unpacker plugins.
This contains functions related to shell scripts, package managers, and the
pack files.
"""
from __future__ import division, print_function, unicode_literals
from reprounzip.utils import join_root
from reprounzip.unpackers.common.misc import UsageError, \
COMPAT_OK, COMPAT_NO, COMPAT_MAYBE, \
composite_action, target_must_exist, unique_names, \
make_unique_name, shell_escape, load_config, busybox_url, sudo_url, \
FileUploader, FileDownloader, get_runs, add_environment_options, \
fixup_environment, interruptible_call, \
metadata_read, metadata_write, metadata_initial_iofiles, \
metadata_update_run, parse_ports
from reprounzip.unpackers.common.packages import THIS_DISTRIBUTION, \
PKG_NOT_INSTALLED, CantFindInstaller, select_installer
__all__ = ['THIS_DISTRIBUTION', 'PKG_NOT_INSTALLED', 'select_installer',
'COMPAT_OK', 'COMPAT_NO', 'COMPAT_MAYBE',
'UsageError', 'CantFindInstaller',
'composite_action', 'target_must_exist', 'unique_names',
'make_unique_name', 'shell_escape', 'load_config', 'busybox_url',
'sudo_url',
'join_root', 'FileUploader', 'FileDownloader', 'get_runs',
'add_environment_options', 'fixup_environment',
'interruptible_call', 'metadata_read', 'metadata_write',
'metadata_initial_iofiles', 'metadata_update_run',
'parse_ports']
reprounzip-1.0.10/reprounzip/unpackers/common/misc.py 0000644 0000765 0000024 00000046010 13127776450 023436 0 ustar remram staff 0000000 0000000 # Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
"""Miscellaneous utilities for unpacker plugins.
"""
from __future__ import division, print_function, unicode_literals
import copy
import functools
import logging
import itertools
import os
import pickle
import random
import re
import warnings
from rpaths import PosixPath, Path
import signal
import subprocess
import sys
import tarfile
import reprounzip.common
from reprounzip.common import RPZPack
from reprounzip.parameters import get_parameter
from reprounzip.utils import irange, iteritems, itervalues, stdout_bytes, \
unicode_, join_root, copyfile
COMPAT_OK = 0
COMPAT_NO = 1
COMPAT_MAYBE = 2
class UsageError(Exception):
def __init__(self, msg="Invalid command-line"):
Exception.__init__(self, msg)
def composite_action(*functions):
"""Makes an action that just calls several other actions in sequence.
Useful to implement ``myplugin setup`` in terms of ``myplugin setup/part1``
and ``myplugin setup/part2``: simply use
``act1n2 = composite_action(act1, act2)``.
"""
def wrapper(args):
for function in functions:
function(args)
return wrapper
def target_must_exist(func):
"""Decorator that checks that ``args.target`` exists.
"""
@functools.wraps(func)
def wrapper(args):
target = Path(args.target[0])
if not target.is_dir():
logging.critical("Error: Target directory doesn't exist")
raise UsageError
return func(args)
return wrapper
def unique_names():
"""Generates unique sequences of bytes.
"""
characters = (b"abcdefghijklmnopqrstuvwxyz"
b"0123456789")
characters = [characters[i:i + 1] for i in irange(len(characters))]
rng = random.Random()
while True:
letters = [rng.choice(characters) for i in irange(10)]
yield b''.join(letters)
unique_names = unique_names()
def make_unique_name(prefix):
"""Makes a unique (random) bytestring name, starting with the given prefix.
"""
assert isinstance(prefix, bytes)
return prefix + next(unique_names)
safe_shell_chars = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
"abcdefghijklmnopqrstuvwxyz"
"0123456789"
"-+=/:.,%_")
def shell_escape(s):
r"""Given bl"a, returns "bl\\"a".
"""
if isinstance(s, bytes):
s = s.decode('utf-8')
if not s or any(c not in safe_shell_chars for c in s):
return '"%s"' % (s.replace('\\', '\\\\')
.replace('"', '\\"')
.replace('`', '\\`')
.replace('$', '\\$'))
else:
return s
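# Illustrative results (not exhaustive):
#   shell_escape('file-1.txt')  ->  file-1.txt      (only safe characters)
#   shell_escape('bl"a')        ->  "bl\"a"         (quoted, quote escaped)
#   shell_escape('$HOME dir')   ->  "\$HOME dir"    (quoted, $ escaped)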
def load_config(pack):
"""Utility method loading the YAML configuration from inside a pack file.
Decompresses the config.yml file from the tarball to a temporary file then
loads it. Note that decompressing a single file is inefficient, thus
calling this method can be slow.
"""
rpz_pack = RPZPack(pack)
with rpz_pack.with_config() as configfile:
return reprounzip.common.load_config(configfile, canonical=True)
def busybox_url(arch):
"""Gets the correct URL for the busybox binary given the architecture.
"""
return get_parameter('busybox_url')[arch]
def sudo_url(arch):
"""Gets the correct URL for the rpzsudo binary given the architecture.
"""
return get_parameter('rpzsudo_url')[arch]
class FileUploader(object):
"""Common logic for 'upload' commands.
"""
data_tgz = 'data.tgz'
def __init__(self, target, input_files, files):
self.target = target
self.input_files = input_files
self.run(files)
def run(self, files):
reprounzip.common.record_usage(upload_files=len(files))
inputs_outputs = self.get_config().inputs_outputs
# No argument: list all the input files and exit
if not files:
print("Input files:")
for input_name in sorted(n for n, f in iteritems(inputs_outputs)
if f.read_runs):
assigned = self.input_files.get(input_name)
if assigned is None:
assigned = "(original)"
elif assigned is False:
assigned = "(not created)"
elif assigned is True:
assigned = "(generated)"
else:
assert isinstance(assigned, (bytes, unicode_))
print(" %s: %s" % (input_name, assigned))
return
self.prepare_upload(files)
try:
# Upload files
for filespec in files:
filespec_split = filespec.rsplit(':', 1)
if len(filespec_split) != 2:
logging.critical("Invalid file specification: %r",
filespec)
sys.exit(1)
local_path, input_name = filespec_split
try:
input_path = inputs_outputs[input_name].path
except KeyError:
logging.critical("Invalid input file: %r", input_name)
sys.exit(1)
temp = None
if not local_path:
# Restore original file from pack
logging.debug("Restoring input file %s", input_path)
fd, temp = Path.tempfile(prefix='reprozip_input_')
os.close(fd)
local_path = self.extract_original_input(input_name,
input_path,
temp)
if local_path is None:
temp.remove()
logging.warning("No original packed, can't restore "
"input file %s", input_name)
continue
else:
local_path = Path(local_path)
logging.debug("Uploading file %s to %s",
local_path, input_path)
if not local_path.exists():
logging.critical("Local file %s doesn't exist",
local_path)
sys.exit(1)
self.upload_file(local_path, input_path)
if temp is not None:
temp.remove()
self.input_files.pop(input_name, None)
else:
self.input_files[input_name] = local_path.absolute().path
finally:
self.finalize()
def get_config(self):
return reprounzip.common.load_config(self.target / 'config.yml',
canonical=True)
def prepare_upload(self, files):
pass
def extract_original_input(self, input_name, input_path, temp):
tar = tarfile.open(str(self.target / self.data_tgz), 'r:*')
try:
member = tar.getmember(str(join_root(PosixPath('DATA'),
input_path)))
except KeyError:
return None
member = copy.copy(member)
member.name = str(temp.components[-1])
tar.extract(member, str(temp.parent))
tar.close()
return temp
def upload_file(self, local_path, input_path):
raise NotImplementedError
def finalize(self):
pass
class FileDownloader(object):
"""Common logic for 'download' commands.
"""
def __init__(self, target, files, all_=False):
self.target = target
self.run(files, all_)
def run(self, files, all_):
reprounzip.common.record_usage(download_files=len(files))
inputs_outputs = self.get_config().inputs_outputs
# No argument: list all the output files and exit
if not (all_ or files):
print("Output files:")
for output_name in sorted(n for n, f in iteritems(inputs_outputs)
if f.write_runs):
print(" %s" % output_name)
return
# Parse the name[:path] syntax
resolved_files = []
all_files = set(n for n, f in iteritems(inputs_outputs)
if f.write_runs)
for filespec in files:
filespec_split = filespec.split(':', 1)
if len(filespec_split) == 1:
output_name = local_path = filespec
elif len(filespec_split) == 2:
output_name, local_path = filespec_split
else:
logging.critical("Invalid file specification: %r",
filespec)
sys.exit(1)
local_path = Path(local_path) if local_path else None
all_files.discard(output_name)
resolved_files.append((output_name, local_path))
        # If all_ is set, add all the files that weren't explicitly named
if all_:
for output_name in all_files:
resolved_files.append((output_name, Path(output_name)))
self.prepare_download(resolved_files)
success = True
try:
# Download files
for output_name, local_path in resolved_files:
try:
remote_path = inputs_outputs[output_name].path
except KeyError:
logging.critical("Invalid output file: %r", output_name)
sys.exit(1)
logging.debug("Downloading file %s", remote_path)
if local_path is None:
ret = self.download_and_print(remote_path)
else:
ret = self.download(remote_path, local_path)
if ret is None:
ret = True
warnings.warn("download() returned None instead of "
"True/False, assuming True",
category=DeprecationWarning)
if not ret:
success = False
if not success:
sys.exit(1)
finally:
self.finalize()
def get_config(self):
return reprounzip.common.load_config(self.target / 'config.yml',
canonical=True)
def prepare_download(self, files):
pass
def download_and_print(self, remote_path):
# Download to temporary file
fd, temp = Path.tempfile(prefix='reprozip_output_')
os.close(fd)
download_status = self.download(remote_path, temp)
if download_status is not None and not download_status:
return False
# Output to stdout
with temp.open('rb') as fp:
copyfile(fp, stdout_bytes)
temp.remove()
return True
def download(self, remote_path, local_path):
raise NotImplementedError
def finalize(self):
pass
def get_runs(runs, selected_runs, cmdline):
"""Selects which run(s) to execute based on parts of the command-line.
Will return an iterable of run numbers. Might also fail loudly or exit
after printing the original command-line.
"""
name_map = dict((r['id'], i) for i, r in enumerate(runs) if 'id' in r)
run_list = []
def parse_run(s):
try:
r = int(s)
except ValueError:
logging.critical("Error: Unknown run %s", s)
raise UsageError
if r < 0 or r >= len(runs):
logging.critical("Error: Expected 0 <= run <= %d, got %d",
len(runs) - 1, r)
sys.exit(1)
return r
if selected_runs is None:
run_list = list(irange(len(runs)))
else:
for run_item in selected_runs.split(','):
run_item = run_item.strip()
if run_item in name_map:
run_list.append(name_map[run_item])
continue
sep = run_item.find('-')
if sep == -1:
run_list.append(parse_run(run_item))
else:
if sep > 0:
first = parse_run(run_item[:sep])
else:
first = 0
if sep + 1 < len(run_item):
last = parse_run(run_item[sep + 1:])
else:
last = len(runs) - 1
if last < first:
logging.critical("Error: Last run number should be "
"greater than the first")
sys.exit(1)
run_list.extend(irange(first, last + 1))
# --cmdline without arguments: display the original command-line
if cmdline == []:
print("Original command-lines:")
for run in run_list:
print(' '.join(shell_escape(arg)
for arg in runs[run]['argv']))
sys.exit(0)
return run_list
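# Illustrative selections for an experiment with 3 runs:
#   selected_runs=None     -> [0, 1, 2]   (all runs)
#   selected_runs="1"      -> [1]
#   selected_runs="0-1"    -> [0, 1]
#   selected_runs="1-"     -> [1, 2]      (open-ended range)
#   selected_runs="build"  -> the run whose 'id' is "build", if any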
def add_environment_options(parser):
parser.add_argument('--pass-env', action='append', default=[],
help="Environment variable to pass through from the "
"host (value from the original machine will be "
"overridden; can be passed multiple times)")
parser.add_argument('--set-env', action='append', default=[],
help="Environment variable to set (value from the "
"original machine will be ignored; can be passed "
"multiple times)")
def fixup_environment(environ, args):
if not (args.pass_env or args.set_env):
return environ
environ = dict(environ)
regexes = [re.compile(pattern + '$') for pattern in args.pass_env]
for var in os.environ:
if any(regex.match(var) for regex in regexes):
environ[var] = os.environ[var]
for var in args.set_env:
if '=' in var:
var, value = var.split('=', 1)
environ[var] = value
else:
environ.pop(var, None)
return environ
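# For example (sketch): with --pass-env 'LC_.*' --set-env TERM=xterm
# --set-env DEBUG, every local LC_* variable is passed through, TERM is
# forced to "xterm", and DEBUG is removed from the experiment's environment.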
def interruptible_call(*args, **kwargs):
assert signal.getsignal(signal.SIGINT) == signal.default_int_handler
proc = [None]
def _sigint_handler(signum, frame):
if proc[0] is not None:
try:
proc[0].send_signal(signum)
except OSError:
pass
signal.signal(signal.SIGINT, _sigint_handler)
try:
proc[0] = subprocess.Popen(*args, **kwargs)
return proc[0].wait()
finally:
signal.signal(signal.SIGINT, signal.default_int_handler)
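# Behaves like subprocess.call(), except that SIGINT (Ctrl+C) is forwarded to
# the child process instead of only interrupting reprounzip itself, e.g.
# (sketch):
#     retcode = interruptible_call(['make', 'test'])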
def metadata_read(path, type_):
"""Read the unpacker-specific metadata from an unpacked directory.
:param path: The unpacked directory; `.reprounzip` will be appended to get
the name of the pickle file.
:param type_: The name of the unpacker, to check for consistency.
Unpackers need to store some specific information, along with the status of
the input files. This is done in a consistent way so that showfiles can
    access it (and to avoid duplicating this code in every unpacker).
It's a simple pickled dictionary under path / '.reprounzip'. The
'input_files' key stores the status of the input files.
If you change it, don't forget to call `metadata_write` to write it to disk
again.
"""
filename = path / '.reprounzip'
if not filename.exists():
logging.critical("Required metadata missing, did you point this "
"command at the directory you created using the "
"'setup' command?")
raise UsageError
with filename.open('rb') as fp:
dct = pickle.load(fp)
if type_ is not None and dct['unpacker'] != type_:
logging.critical("Wrong unpacker used: %s != %s",
dct['unpacker'], type_)
raise UsageError
return dct
def metadata_write(path, dct, type_):
"""Write the unpacker-specific metadata in an unpacked directory.
:param path: The unpacked directory; `.reprounzip` will be appended to get
the name of the pickle file.
:param type_: The name of the unpacker, that is written to the pickle file
under the key 'unpacker'.
:param dct: The dictionary with the info to write to the file.
"""
filename = path / '.reprounzip'
to_write = {'unpacker': type_}
to_write.update(dct)
with filename.open('wb') as fp:
pickle.dump(to_write, fp, 2)
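# Round-trip sketch (the 'directory' unpacker name is just an example):
#     metadata_write(target, metadata_initial_iofiles(config), 'directory')
#     ...
#     dct = metadata_read(target, 'directory')  # UsageError if types differ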
def metadata_initial_iofiles(config, dct=None):
"""Add the initial state of the {in/out}put files to the unpacker metadata.
:param config: The configuration as returned by `load_config()`, which will
be used to list the input and output files and to determine which ones have
been packed (and therefore exist initially).
The `input_files` key contains a dict mapping the name to either:
    * None (or key absent): the original file, which still exists
    * False: doesn't exist (wasn't packed)
    * True: has been generated by one of the runs since the experiment was
unpacked
* basestring: the user uploaded a file with this path, and no run has
overwritten it yet
"""
if dct is None:
dct = {}
path2iofile = {f.path: n
for n, f in iteritems(config.inputs_outputs)}
def packed_files():
yield config.other_files
for pkg in config.packages:
if pkg.packfiles:
yield pkg.files
for f in itertools.chain.from_iterable(packed_files()):
f = f.path
path2iofile.pop(f, None)
dct['input_files'] = dict((n, False) for n in itervalues(path2iofile))
return dct
def metadata_update_run(config, dct, runs):
"""Update the unpacker metadata after some runs have executed.
:param runs: An iterable of run numbers that were probably executed.
This maintains a crude idea of the status of input and output files by
updating the files that are outputs of the runs that were just executed.
This means that files that were uploaded by the user will no longer be
shown as uploaded (they have been overwritten by the experiment) and files
that weren't packed exist from now on.
This is not very reliable because a run might have created a file that is
not designated as its output anyway, or might have failed and thus not
created the output (or a bad output).
"""
runs = set(runs)
input_files = dct.setdefault('input_files', {})
for name, fi in iteritems(config.inputs_outputs):
if any(r in runs for r in fi.write_runs):
input_files[name] = True
_port_re = re.compile('^(?:([0-9]+):)?([0-9]+)(?:/([a-z]+))?$')
def parse_ports(specifications):
ports = []
for port in specifications:
m = _port_re.match(port)
if m is None:
logging.critical("Invalid port specification: '%s'", port)
sys.exit(1)
host, experiment, proto = m.groups()
if not host:
host = experiment
if not proto:
proto = 'tcp'
ports.append((int(host), int(experiment), proto))
return ports
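# Accepted specifications (illustrative):
#   "8000"         -> (8000, 8000, 'tcp')   host port defaults to the same
#   "8080:80"      -> (8080, 80, 'tcp')     protocol defaults to tcp
#   "5353:53/udp"  -> (5353, 53, 'udp')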
reprounzip-1.0.10/reprounzip/unpackers/common/packages.py 0000644 0000765 0000024 00000014501 13073250224 024243 0 ustar remram staff 0000000 0000000 # Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
"""Utility functions dealing with package managers.
"""
from __future__ import division, print_function, unicode_literals
import logging
import platform
import subprocess
from reprounzip.unpackers.common.misc import UsageError
from reprounzip.utils import itervalues
THIS_DISTRIBUTION = platform.linux_distribution()[0].lower()
PKG_NOT_INSTALLED = "(not installed)"
class CantFindInstaller(UsageError):
def __init__(self, msg="Can't select a package installer"):
UsageError.__init__(self, msg)
class AptInstaller(object):
"""Installer for deb-based systems (Debian, Ubuntu).
"""
def __init__(self, binary):
self.bin = binary
def install(self, packages, assume_yes=False):
# Installs
options = []
if assume_yes:
options.append('-y')
required_pkgs = set(pkg.name for pkg in packages)
r = subprocess.call([self.bin, 'install'] +
options + list(required_pkgs))
# Checks on packages
pkgs_status = self.get_packages_info(packages)
for pkg, status in itervalues(pkgs_status):
            # get_packages_info() never yields None; anything other than
            # PKG_NOT_INSTALLED means the package is present now
            if status != PKG_NOT_INSTALLED:
required_pkgs.discard(pkg.name)
if required_pkgs:
logging.error("Error: some packages could not be installed:%s",
''.join("\n %s" % pkg for pkg in required_pkgs))
return r, pkgs_status
@staticmethod
def get_packages_info(packages):
if not packages:
return {}
p = subprocess.Popen(['dpkg-query',
'--showformat=${Package;-50}\t${Version}\n',
'-W'] +
[pkg.name for pkg in packages],
stdout=subprocess.PIPE)
# name -> (pkg, installed_version)
pkgs_dict = dict((pkg.name, (pkg, PKG_NOT_INSTALLED))
for pkg in packages)
try:
for l in p.stdout:
fields = l.split()
if len(fields) == 2:
name = fields[0].decode('ascii')
status = fields[1].decode('ascii')
pkg, _ = pkgs_dict[name]
pkgs_dict[name] = pkg, status
finally:
p.wait()
return pkgs_dict
def update_script(self):
return '%s update' % self.bin
def install_script(self, packages):
return '%s install -y %s' % (self.bin,
' '.join(pkg.name for pkg in packages))
class YumInstaller(object):
"""Installer for systems using RPM and Yum (Fedora, CentOS, Red-Hat).
"""
@classmethod
def install(cls, packages, assume_yes=False):
options = []
if assume_yes:
options.append('-y')
required_pkgs = set(pkg.name for pkg in packages)
r = subprocess.call(['yum', 'install'] + options + list(required_pkgs))
# Checks on packages
pkgs_status = cls.get_packages_info(packages)
for pkg, status in itervalues(pkgs_status):
            if status != PKG_NOT_INSTALLED:
required_pkgs.discard(pkg.name)
if required_pkgs:
logging.error("Error: some packages could not be installed:%s",
''.join("\n %s" % pkg for pkg in required_pkgs))
return r, pkgs_status
@staticmethod
def get_packages_info(packages):
if not packages:
return {}
p = subprocess.Popen(['rpm', '-q'] +
[pkg.name for pkg in packages] +
['--qf', '+%{NAME} %{VERSION}-%{RELEASE}\\n'],
stdout=subprocess.PIPE)
# name -> {pkg, installed_version}
pkgs_dict = dict((pkg.name, (pkg, PKG_NOT_INSTALLED))
for pkg in packages)
try:
for l in p.stdout:
                if l[:1] == b'+':  # slice, not index: l[0] is an int on Py 3
fields = l[1:].split()
if len(fields) == 2:
name = fields[0].decode('ascii')
status = fields[1].decode('ascii')
pkg, _ = pkgs_dict[name]
pkgs_dict[name] = pkg, status
finally:
p.wait()
return pkgs_dict
@staticmethod
def update_script():
return ''
@staticmethod
def install_script(packages):
return 'yum install -y %s' % ' '.join(pkg.name for pkg in packages)
def select_installer(pack, runs, target_distribution=THIS_DISTRIBUTION,
check_distrib_compat=True):
"""Selects the right package installer for a Linux distribution.
"""
orig_distribution = runs[0]['distribution'][0].lower()
# Checks that the distributions match
if not check_distrib_compat:
pass
elif (set([orig_distribution, target_distribution]) ==
set(['ubuntu', 'debian'])):
# Packages are more or less the same on Debian and Ubuntu
logging.warning("Installing on %s but pack was generated on %s",
target_distribution.capitalize(),
orig_distribution.capitalize())
elif target_distribution is None:
raise CantFindInstaller("Target distribution is unknown; try using "
"--distribution")
elif orig_distribution != target_distribution:
raise CantFindInstaller(
"Installing on %s but pack was generated on %s" % (
target_distribution.capitalize(),
orig_distribution.capitalize()))
# Selects installation method
if target_distribution == 'ubuntu':
installer = AptInstaller('apt-get')
elif target_distribution == 'debian':
# aptitude is not installed by default, so use apt-get here too
installer = AptInstaller('apt-get')
elif (target_distribution in ('centos', 'centos linux',
'fedora', 'scientific linux') or
target_distribution.startswith('red hat')):
installer = YumInstaller()
else:
raise CantFindInstaller("This distribution, \"%s\", is not supported" %
target_distribution.capitalize())
return installer
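# Usage sketch (assuming the pack was made on a Debian-family machine):
#     installer = select_installer(pack, runs)
#     r, pkgs_status = installer.install(packages, assume_yes=True)
# An unknown or unsupported target distribution raises CantFindInstaller.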
reprounzip-1.0.10/reprounzip/unpackers/common/x11.py 0000644 0000765 0000024 00000035326 13127722141 023110 0 ustar remram staff 0000000 0000000 # Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
"""Utility functions dealing with X servers.
"""
from __future__ import division, print_function, unicode_literals
import contextlib
import logging
import os
from rpaths import Path, PosixPath
import select
import socket
import struct
import threading
from reprounzip.utils import irange, iteritems
# #include
#
# typedef struct xauth {
# unsigned short family;
# unsigned short address_length;
# char *address;
# unsigned short number_length;
# char *number;
# unsigned short name_length;
# char *name;
# unsigned short data_length;
# char *data;
# } Xauth;
_read_short = lambda fp: struct.unpack('>H', fp.read(2))[0]
_write_short = lambda i: struct.pack('>H', i)
def ascii(s):
if isinstance(s, bytes):
return s
else:
return s.encode('ascii')
class Xauth(object):
"""A record in an Xauthority file.
"""
FAMILY_LOCAL = 256
FAMILY_INTERNET = 0
FAMILY_DECNET = 1
FAMILY_CHAOS = 2
FAMILY_INTERNET6 = 6
FAMILY_SERVERINTERPRETED = 5
def __init__(self, family, address, number, name, data):
self.family = family
self.address = address
self.number = number
self.name = name
self.data = data
@classmethod
def from_file(cls, fp):
family = _read_short(fp)
address_length = _read_short(fp)
address = fp.read(address_length)
number_length = _read_short(fp)
number = int(fp.read(number_length))
name_length = _read_short(fp)
name = fp.read(name_length)
data_length = _read_short(fp)
data = fp.read(data_length)
return cls(family, address, number, name, data)
def as_bytes(self):
number = ('%d' % self.number).encode('ascii')
return (_write_short(self.family) +
_write_short(len(self.address)) +
ascii(self.address) +
_write_short(len(number)) +
number +
_write_short(len(self.name)) +
ascii(self.name) +
_write_short(len(self.data)) +
ascii(self.data))
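# Reading every record from an Xauthority file (sketch; mirrors the loop in
# X11Handler below, and the path is only the usual default location):
#     with open(os.path.expanduser('~/.Xauthority'), 'rb') as fp:
#         fp.seek(0, os.SEEK_END)
#         size = fp.tell()
#         fp.seek(0, os.SEEK_SET)
#         while fp.tell() < size:
#             entry = Xauth.from_file(fp)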
class BaseX11Handler(object):
"""X11 handler.
This selects a way to connect to the local X server and an authentication
    mechanism. It provides `fix_env()` to set the X environment variables for
    the experiment, `init_cmds` to set up X before running the experiment's
    main commands, and `port_forward` which describes the reverse port tunnels from
the experiment to the local X server.
"""
class X11Handler(BaseX11Handler):
"""X11 handler that will connect to a server outside on the host.
This connects out of the created environment using the network. It is used
by Vagrant (through SSH) and Docker (TCP connection), and may have
significant latency.
"""
DISPLAY_NUMBER = 15
SOCK2X = {socket.AF_INET: Xauth.FAMILY_INTERNET,
socket.AF_INET6: Xauth.FAMILY_INTERNET6}
X2SOCK = dict((v, k) for k, v in iteritems(SOCK2X))
def __init__(self, enabled, target, display=None):
self.enabled = enabled
if not self.enabled:
return
self.target = target
self.xauth = PosixPath('/.reprounzip_xauthority')
self.display = (int(display) if display is not None
else self.DISPLAY_NUMBER)
logging.debug("X11 support enabled; will create Xauthority file %s "
"for experiment. Display number is %d", self.xauth,
self.display)
# List of addresses that match the $DISPLAY variable
possible, local_display = self._locate_display()
tcp_portnum = ((6000 + local_display) if local_display is not None
else None)
if ('XAUTHORITY' in os.environ and
Path(os.environ['XAUTHORITY']).is_file()):
xauthority = Path(os.environ['XAUTHORITY'])
# Note: I'm assuming here that Xauthority has no XDG support
else:
xauthority = Path('~').expand_user() / '.Xauthority'
# Read Xauthority file
xauth_entries = {}
if xauthority.is_file():
with xauthority.open('rb') as fp:
fp.seek(0, os.SEEK_END)
size = fp.tell()
fp.seek(0, os.SEEK_SET)
while fp.tell() < size:
entry = Xauth.from_file(fp)
                    # entry.name is bytes (read from the file)
                    if (entry.name == b'MIT-MAGIC-COOKIE-1' and
entry.number == local_display):
if entry.family == Xauth.FAMILY_LOCAL:
xauth_entries[(entry.family, None)] = entry
elif (entry.family == Xauth.FAMILY_INTERNET or
entry.family == Xauth.FAMILY_INTERNET6):
xauth_entries[(entry.family,
entry.address)] = entry
# FIXME: this completely ignores addresses
logging.debug("Possible X endpoints: %s", (possible,))
# Select socket and authentication cookie
self.xauth_record = None
self.connection_info = None
for family, address in possible:
# Checks that we have a cookie
            entry = family, (None if family == Xauth.FAMILY_LOCAL else address)
if entry not in xauth_entries:
continue
if family == Xauth.FAMILY_LOCAL and hasattr(socket, 'AF_UNIX'):
# Checks that the socket exists
if not Path(address).exists():
continue
self.connection_info = (socket.AF_UNIX, socket.SOCK_STREAM,
address)
self.xauth_record = xauth_entries[(family, None)]
logging.debug("Will connect to local X display via UNIX "
"socket %s", address)
break
else:
# Checks that we have a cookie
family = self.X2SOCK[family]
self.connection_info = (family, socket.SOCK_STREAM,
(address, tcp_portnum))
self.xauth_record = xauth_entries[(family, address)]
logging.debug("Will connect to X display %s:%d via %s/TCP",
address, tcp_portnum,
"IPv6" if family == socket.AF_INET6 else "IPv4")
break
# Didn't find an Xauthority record -- assume no authentication is
# needed, but still set self.connection_info
if self.connection_info is None:
for family, address in possible:
# Only try UNIX sockets, we'll use 127.0.0.1 otherwise
if family == Xauth.FAMILY_LOCAL:
if not hasattr(socket, 'AF_UNIX'):
continue
self.connection_info = (socket.AF_UNIX, socket.SOCK_STREAM,
address)
logging.debug("Will connect to X display via UNIX socket "
"%s, no authentication", address)
break
else:
self.connection_info = (socket.AF_INET, socket.SOCK_STREAM,
('127.0.0.1', tcp_portnum))
logging.debug("Will connect to X display 127.0.0.1:%d via "
"IPv4/TCP, no authentication",
tcp_portnum)
if self.connection_info is None:
raise RuntimeError("Couldn't determine how to connect to local X "
"server, DISPLAY is %s" % (
repr(os.environ['DISPLAY'])
                                   if 'DISPLAY' in os.environ
else 'not set'))
@classmethod
def _locate_display(cls):
"""Reads $DISPLAY and figures out possible sockets.
"""
# We default to ":0", Xming for instance doesn't set $DISPLAY
display = os.environ.get('DISPLAY', ':0')
# It might be the full path to a UNIX socket
if display.startswith('/'):
return [(Xauth.FAMILY_LOCAL, display)], None
local_addr, local_display = display.rsplit(':', 1)
local_display = int(local_display.split('.', 1)[0])
# Let's order the socket families: IPv4 first, then v6, then others
def sort_families(gai, order={socket.AF_INET: 0, socket.AF_INET6: 1}):
return sorted(gai, key=lambda x: order.get(x[0], 999999))
# Network addresses of the local machine
local_addresses = []
for family, socktype, proto, canonname, sockaddr in \
sort_families(socket.getaddrinfo(socket.gethostname(), 6000)):
try:
family = cls.SOCK2X[family]
except KeyError:
continue
local_addresses.append((family, sockaddr[0]))
logging.debug("Local addresses: %s", (local_addresses,))
# Determine possible addresses for $DISPLAY
if not local_addr:
possible = [(Xauth.FAMILY_LOCAL,
'/tmp/.X11-unix/X%d' % local_display)]
possible += local_addresses
else:
local_possible = False
possible = []
for family, socktype, proto, canonname, sockaddr in \
sort_families(socket.getaddrinfo(local_addr, 6000)):
try:
family = cls.SOCK2X[family]
except KeyError:
continue
if (family, sockaddr[0]) in local_addresses:
local_possible = True
possible.append((family, sockaddr[0]))
if local_possible:
possible = [(Xauth.FAMILY_LOCAL,
'/tmp/.X11-unix/X%d' % local_display)] + possible
return possible, local_display
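    # Illustrative results: with DISPLAY=":0" this returns the UNIX socket
    # /tmp/.X11-unix/X0 plus this machine's network addresses, and display
    # number 0; with DISPLAY="localhost:10.0" it returns the addresses of
    # localhost and display number 10 (TCP port 6010 is derived by the caller).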
@property
def port_forward(self):
"""Builds the port forwarding info, for `run_interactive()`.
        Requests port 6000 + display number (6015 by default) on the remote
        host to be forwarded to the X socket identified by
        `self.connection_info`.
"""
if not self.enabled:
return []
@contextlib.contextmanager
def connect(src_addr):
logging.info("Got remote X connection from %s", (src_addr,))
logging.debug("Connecting to X server: %s",
(self.connection_info,))
sock = socket.socket(*self.connection_info[:2])
sock.connect(self.connection_info[2])
yield sock
sock.close()
logging.info("X connection from %s closed", (src_addr,))
return [(6000 + self.display, connect)]
def fix_env(self, env):
"""Sets ``$XAUTHORITY`` and ``$DISPLAY`` in the environment.
"""
if not self.enabled:
return env
new_env = dict(env)
new_env['XAUTHORITY'] = str(self.xauth)
if self.target[0] == 'local':
new_env['DISPLAY'] = '127.0.0.1:%d' % self.display
elif self.target[0] == 'internet':
new_env['DISPLAY'] = '%s:%d' % (self.target[1], self.display)
return new_env
@property
def init_cmds(self):
"""Gets the commands to setup X on the server before the experiment.
"""
if not self.enabled or self.xauth_record is None:
return []
if self.target[0] == 'local':
xauth_record = Xauth(Xauth.FAMILY_LOCAL,
self.target[1],
self.display,
self.xauth_record.name,
self.xauth_record.data)
elif self.target[0] == 'internet':
xauth_record = Xauth(Xauth.FAMILY_INTERNET,
socket.inet_aton(self.target[1]),
self.display,
self.xauth_record.name,
self.xauth_record.data)
else:
raise RuntimeError("Invalid target display type")
buf = xauth_record.as_bytes()
xauth = ''.join(('\\x%02x' % ord(buf[i:i + 1]))
for i in irange(len(buf)))
return ['echo -ne "%s" > %s' % (xauth, self.xauth)]
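# The command generated by init_cmds has this shape (bytes are illustrative):
#     echo -ne "\x00\x01..." > /.reprounzip_xauthority
# i.e. the Xauthority record is recreated inside the environment byte by
# byte, without requiring the xauth tool to be installed there.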
class BaseForwarder(object):
"""Accepts connections and forwards to the given connector object.
    The `connector` is a function which takes the address of the remote process
    connecting on this end, and gives out a socket object that is the second
endpoint of the tunnel. The socket object must provide ``recv()``,
``sendall()`` and ``close()``.
Abstract class, implementations will provide actual ways to accept
connections.
"""
def __init__(self, connector):
self.connector = connector
def _forward(self, client, src_addr):
try:
with self.connector(src_addr) as local_connection:
local_fd = local_connection.fileno()
client_fd = client.fileno()
while True:
r, w, x = select.select([local_fd, client_fd], [], [])
if local_fd in r:
data = local_connection.recv(4096)
if not data:
break
client.sendall(data)
elif client_fd in r:
data = client.recv(4096)
if not data:
break
local_connection.sendall(data)
finally:
client.close()
class LocalForwarder(BaseForwarder):
"""Listens on a random port and forwards to the given connector object.
    The `connector` is a function which takes the address of the remote process
    connecting on this end, and gives out a socket object that is the second
endpoint of the tunnel. The socket object must provide ``recv()``,
``sendall()`` and ``close()``.
"""
def __init__(self, connector, local_port=None):
BaseForwarder.__init__(self, connector)
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('', local_port or 0))
self.local_port = server.getsockname()[1]
server.listen(5)
t = threading.Thread(target=self._accept, args=(server,))
t.setDaemon(True)
t.start()
def _accept(self, server):
while True:
client, src_addr = server.accept()
t = threading.Thread(target=self._forward,
args=(client, src_addr))
t.setDaemon(True)
t.start()
reprounzip-1.0.10/reprounzip/unpackers/default.py 0000644 0000765 0000024 00000106452 13127776450 022646 0 ustar remram staff 0000000 0000000 # Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
"""Default unpackers for reprounzip.
This file contains the default plugins that come with reprounzip:
- ``directory`` puts all the files in a simple directory. This is simple but
can be unreliable.
- ``chroot`` creates a chroot environment. This is more reliable as you get a
harder isolation from the host system.
- ``installpkgs`` installs on your distribution the packages that were used by
the experiment on the original machine. This is useful if some of them were
not packed and you do not have them installed.
"""
from __future__ import division, print_function, unicode_literals
import argparse
import copy
import logging
import os
import platform
from rpaths import PosixPath, DefaultAbstractPath, Path
import socket
import subprocess
import sys
import tarfile
from reprounzip.common import RPZPack, load_config as load_config_file, \
record_usage
from reprounzip import signals
from reprounzip.unpackers.common import THIS_DISTRIBUTION, PKG_NOT_INSTALLED, \
COMPAT_OK, COMPAT_NO, CantFindInstaller, target_must_exist, shell_escape, \
load_config, select_installer, busybox_url, join_root, FileUploader, \
FileDownloader, get_runs, add_environment_options, fixup_environment, \
interruptible_call, metadata_read, metadata_write, \
metadata_initial_iofiles, metadata_update_run
from reprounzip.unpackers.common.x11 import X11Handler, LocalForwarder
from reprounzip.utils import unicode_, irange, iteritems, itervalues, \
stdout_bytes, stderr, make_dir_writable, rmtree_fixed, copyfile, \
download_file
def installpkgs(args):
"""Installs the necessary packages on the current machine.
"""
if not THIS_DISTRIBUTION:
logging.critical("Not running on Linux")
sys.exit(1)
pack = args.pack[0]
missing = args.missing
# Loads config
runs, packages, other_files = load_config(pack)
try:
installer = select_installer(pack, runs)
    except CantFindInstaller as e:
        logging.error("Couldn't select a package installer: %s", e)
        # `installer` would be unbound below, so we can't continue
        sys.exit(1)
if args.summary:
# Print out a list of packages with their status
if missing:
print("Packages not present in pack:")
packages = [pkg for pkg in packages if not pkg.packfiles]
else:
print("All packages:")
pkgs = installer.get_packages_info(packages)
for pkg in packages:
print(" %s (required version: %s, status: %s)" % (
pkg.name, pkg.version, pkgs[pkg.name][1]))
else:
if missing:
# With --missing, ignore packages whose files were packed
packages = [pkg for pkg in packages if not pkg.packfiles]
# Installs packages
record_usage(installpkgs_installing=len(packages))
r, pkgs = installer.install(packages, assume_yes=args.assume_yes)
for pkg in packages:
req = pkg.version
real = pkgs[pkg.name][1]
if real == PKG_NOT_INSTALLED:
logging.warning("package %s was not installed", pkg.name)
            elif real != req:
                logging.warning("version %s of %s was installed, instead of "
                                "%s", real, pkg.name, req)
if r != 0:
logging.critical("Installer exited with %d", r)
sys.exit(r)
def directory_create(args):
"""Unpacks the experiment in a folder.
Only the files that are not part of a package are copied (unless they are
missing from the system and were packed).
In addition, input files are put in a tar.gz (so they can be put back after
an upload) and the configuration file is extracted.
"""
if not args.pack:
logging.critical("setup needs the pack filename")
sys.exit(1)
pack = Path(args.pack[0])
target = Path(args.target[0])
if target.exists():
logging.critical("Target directory exists")
sys.exit(1)
if not issubclass(DefaultAbstractPath, PosixPath):
logging.critical("Not unpacking on POSIX system")
sys.exit(1)
signals.pre_setup(target=target, pack=pack)
# Unpacks configuration file
rpz_pack = RPZPack(pack)
rpz_pack.extract_config(target / 'config.yml')
# Loads config
config = load_config_file(target / 'config.yml', True)
packages = config.packages
target.mkdir()
root = (target / 'root').absolute()
# Checks packages
missing_files = False
for pkg in packages:
if pkg.packfiles:
continue
for f in pkg.files:
if not Path(f.path).exists():
logging.error(
"Missing file %s (from package %s that wasn't packed) "
"on host, experiment will probably miss it.",
f, pkg.name)
missing_files = True
if missing_files:
record_usage(directory_missing_pkgs=True)
logging.error("Some packages are missing, you should probably install "
"them.\nUse 'reprounzip installpkgs -h' for help")
root.mkdir()
try:
# Unpacks files
members = rpz_pack.list_data()
for m in members:
# Remove 'DATA/' prefix
m.name = str(rpz_pack.remove_data_prefix(m.name))
# Makes symlink targets relative
if m.issym():
linkname = PosixPath(m.linkname)
if linkname.is_absolute:
m.linkname = join_root(root, PosixPath(m.linkname)).path
logging.info("Extracting files...")
rpz_pack.extract_data(root, members)
rpz_pack.close()
# Original input files, so upload can restore them
input_files = [f.path for f in itervalues(config.inputs_outputs)
if f.read_runs]
if input_files:
logging.info("Packing up original input files...")
inputtar = tarfile.open(str(target / 'inputs.tar.gz'), 'w:gz')
for ifile in input_files:
filename = join_root(root, ifile)
if filename.exists():
inputtar.add(str(filename), str(ifile))
inputtar.close()
# Meta-data for reprounzip
metadata_write(target, metadata_initial_iofiles(config), 'directory')
signals.post_setup(target=target, pack=pack)
except Exception:
rmtree_fixed(root)
raise
@target_must_exist
def directory_run(args):
"""Runs the command in the directory.
"""
target = Path(args.target[0])
unpacked_info = metadata_read(target, 'directory')
cmdline = args.cmdline
# Loads config
config = load_config_file(target / 'config.yml', True)
runs = config.runs
selected_runs = get_runs(runs, args.run, cmdline)
root = (target / 'root').absolute()
# Gets library paths
lib_dirs = []
p = subprocess.Popen(['/sbin/ldconfig', '-v', '-N'],
stdout=subprocess.PIPE)
try:
for l in p.stdout:
            if len(l) < 3 or l[:1] in (b' ', b'\t'):  # slice works on Py 2+3
continue
if l.endswith(b':\n'):
lib_dirs.append(Path(l[:-2]))
finally:
p.wait()
lib_dirs = ('export LD_LIBRARY_PATH=%s' % ':'.join(
shell_escape(unicode_(join_root(root, d)))
for d in lib_dirs))
cmds = [lib_dirs]
for run_number in selected_runs:
run = runs[run_number]
cmd = 'cd %s && ' % shell_escape(
unicode_(join_root(root,
Path(run['workingdir']))))
cmd += '/usr/bin/env -i '
environ = run['environ']
environ = fixup_environment(environ, args)
if args.x11:
if 'DISPLAY' in os.environ:
environ['DISPLAY'] = os.environ['DISPLAY']
if 'XAUTHORITY' in os.environ:
environ['XAUTHORITY'] = os.environ['XAUTHORITY']
cmd += ' '.join('%s=%s' % (shell_escape(k), shell_escape(v))
for k, v in iteritems(environ)
if k != 'PATH')
cmd += ' '
# PATH
# Get the original PATH components
path = [PosixPath(d)
for d in run['environ'].get('PATH', '').split(':')]
# The same paths but in the directory
dir_path = [join_root(root, d)
for d in path
if d.root == '/']
# Rebuild string
path = ':'.join(unicode_(d) for d in dir_path + path)
cmd += 'PATH=%s ' % shell_escape(path)
# FIXME : Use exec -a or something if binary != argv[0]
if cmdline is None:
argv = run['argv']
# Rewrites command-line arguments that are absolute filenames
rewritten = False
for i in irange(len(argv)):
try:
p = Path(argv[i])
except UnicodeEncodeError:
continue
if p.is_absolute:
rp = join_root(root, p)
if (rp.exists() or
(len(rp.components) > 3 and rp.parent.exists())):
argv[i] = str(rp)
rewritten = True
if rewritten:
logging.warning("Rewrote command-line as: %s",
' '.join(shell_escape(a) for a in argv))
else:
argv = cmdline
cmd += ' '.join(shell_escape(a) for a in argv)
cmds.append(cmd)
cmds = ' && '.join(cmds)
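    # The assembled shell command has this shape (illustrative):
    #   export LD_LIBRARY_PATH=... && cd <root>/<workingdir> &&
    #   /usr/bin/env -i VAR=value ... PATH=... <argv>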
signals.pre_run(target=target)
retcode = interruptible_call(cmds, shell=True)
stderr.write("\n*** Command finished, status: %d\n" % retcode)
signals.post_run(target=target, retcode=retcode)
# Update input file status
metadata_update_run(config, unpacked_info, selected_runs)
metadata_write(target, unpacked_info, 'directory')
@target_must_exist
def directory_destroy(args):
"""Destroys the directory.
"""
target = Path(args.target[0])
metadata_read(target, 'directory')
logging.info("Removing directory %s...", target)
signals.pre_destroy(target=target)
rmtree_fixed(target)
signals.post_destroy(target=target)
def should_restore_owner(param):
"""Computes whether to restore original files' owners.
"""
if os.getuid() != 0:
if param is True:
            # Restoring the owner was explicitly requested
logging.critical("Not running as root, cannot restore files' "
"owner/group as requested")
sys.exit(1)
elif param is None:
# Nothing was requested
logging.warning("Not running as root, won't restore files' "
"owner/group")
ret = False
else:
# If False: skip warning
ret = False
else:
if param is None:
# Nothing was requested
logging.info("Running as root, we will restore files' "
"owner/group")
ret = True
elif param is True:
ret = True
else:
# If False: skip warning
ret = False
record_usage(restore_owner=ret)
return ret
def should_mount_magic_dirs(param):
"""Computes whether to mount directories inside the chroot.
"""
if os.getuid() != 0:
if param is True:
            # Mounting the magic dirs was explicitly requested
logging.critical("Not running as root, cannot mount /dev and "
"/proc")
sys.exit(1)
elif param is None:
# Nothing was requested
logging.warning("Not running as root, won't mount /dev and /proc")
ret = False
else:
# If False: skip warning
ret = False
else:
if param is None:
# Nothing was requested
logging.info("Running as root, will mount /dev and /proc")
ret = True
elif param is True:
ret = True
else:
# If False: skip warning
ret = False
record_usage(mount_magic_dirs=ret)
return ret
def chroot_create(args):
"""Unpacks the experiment in a folder so it can be run with chroot.
All the files in the pack are unpacked; system files are copied only if
they were not packed, and busybox is installed if /bin/sh wasn't packed.
In addition, input files are put in a tar.gz (so they can be put back after
an upload) and the configuration file is extracted.
"""
if not args.pack:
logging.critical("setup/create needs the pack filename")
sys.exit(1)
pack = Path(args.pack[0])
target = Path(args.target[0])
if target.exists():
logging.critical("Target directory exists")
sys.exit(1)
if not issubclass(DefaultAbstractPath, PosixPath):
logging.critical("Not unpacking on POSIX system")
sys.exit(1)
signals.pre_setup(target=target, pack=pack)
# We can only restore owner/group of files if running as root
restore_owner = should_restore_owner(args.restore_owner)
# Unpacks configuration file
rpz_pack = RPZPack(pack)
rpz_pack.extract_config(target / 'config.yml')
# Loads config
config = load_config_file(target / 'config.yml', True)
packages = config.packages
target.mkdir()
root = (target / 'root').absolute()
root.mkdir()
try:
# Checks that everything was packed
packages_not_packed = [pkg for pkg in packages if not pkg.packfiles]
if packages_not_packed:
record_usage(chroot_missing_pkgs=True)
logging.warning("According to configuration, some files were left "
"out because they belong to the following "
"packages:%s\nWill copy files from HOST SYSTEM",
''.join('\n %s' % pkg
for pkg in packages_not_packed))
missing_files = False
for pkg in packages_not_packed:
for f in pkg.files:
path = Path(f.path)
if not path.exists():
logging.error(
"Missing file %s (from package %s) on host, "
"experiment will probably miss it",
path, pkg.name)
missing_files = True
continue
dest = join_root(root, path)
dest.parent.mkdir(parents=True)
if path.is_link():
dest.symlink(path.read_link())
else:
path.copy(dest)
if restore_owner:
stat = path.stat()
dest.chown(stat.st_uid, stat.st_gid)
if missing_files:
            record_usage(chroot_missing_files=True)
# Unpacks files
members = rpz_pack.list_data()
for m in members:
# Remove 'DATA/' prefix
m.name = str(rpz_pack.remove_data_prefix(m.name))
if not restore_owner:
uid = os.getuid()
gid = os.getgid()
for m in members:
m.uid = uid
m.gid = gid
logging.info("Extracting files...")
rpz_pack.extract_data(root, members)
rpz_pack.close()
resolvconf_src = Path('/etc/resolv.conf')
if resolvconf_src.exists():
try:
resolvconf_src.copy(root / 'etc/resolv.conf')
except IOError:
pass
# Sets up /bin/sh and /usr/bin/env, downloading busybox if necessary
sh_path = join_root(root, Path('/bin/sh'))
env_path = join_root(root, Path('/usr/bin/env'))
if not sh_path.lexists() or not env_path.lexists():
logging.info("Setting up busybox...")
busybox_path = join_root(root, Path('/bin/busybox'))
busybox_path.parent.mkdir(parents=True)
with make_dir_writable(join_root(root, Path('/bin'))):
download_file(busybox_url(config.runs[0]['architecture']),
busybox_path,
'busybox-%s' % config.runs[0]['architecture'])
busybox_path.chmod(0o755)
if not sh_path.lexists():
sh_path.parent.mkdir(parents=True)
sh_path.symlink('/bin/busybox')
if not env_path.lexists():
env_path.parent.mkdir(parents=True)
env_path.symlink('/bin/busybox')
# Original input files, so upload can restore them
input_files = [f.path for f in itervalues(config.inputs_outputs)
if f.read_runs]
if input_files:
logging.info("Packing up original input files...")
inputtar = tarfile.open(str(target / 'inputs.tar.gz'), 'w:gz')
for ifile in input_files:
filename = join_root(root, ifile)
if filename.exists():
inputtar.add(str(filename), str(ifile))
inputtar.close()
# Meta-data for reprounzip
metadata_write(target, metadata_initial_iofiles(config), 'chroot')
signals.post_setup(target=target, pack=pack)
except Exception:
rmtree_fixed(root)
raise
@target_must_exist
def chroot_mount(args):
"""Mounts /dev and /proc inside the chroot directory.
"""
target = Path(args.target[0])
unpacked_info = metadata_read(target, 'chroot')
# Create proc mount
d = target / 'root/proc'
d.mkdir(parents=True)
subprocess.check_call(['mount', '-t', 'proc', 'none', str(d)])
# Bind /dev from host
for m in ('/dev', '/dev/pts'):
d = join_root(target / 'root', Path(m))
d.mkdir(parents=True)
logging.info("Mounting %s on %s...", m, d)
subprocess.check_call(['mount', '-o', 'bind', m, str(d)])
unpacked_info['mounted'] = True
metadata_write(target, unpacked_info, 'chroot')
logging.warning("The host's /dev and /proc have been mounted into the "
"chroot. Do NOT remove the unpacked directory with "
"rm -rf, it WILL WIPE the host's /dev directory.")
@target_must_exist
def chroot_run(args):
"""Runs the command in the chroot.
"""
target = Path(args.target[0])
unpacked_info = metadata_read(target, 'chroot')
cmdline = args.cmdline
# Loads config
config = load_config_file(target / 'config.yml', True)
runs = config.runs
selected_runs = get_runs(runs, args.run, cmdline)
root = target / 'root'
# X11 handler
x11 = X11Handler(args.x11, ('local', socket.gethostname()),
args.x11_display)
cmds = []
for run_number in selected_runs:
run = runs[run_number]
cmd = 'cd %s && ' % shell_escape(run['workingdir'])
cmd += '/usr/bin/env -i '
environ = x11.fix_env(run['environ'])
environ = fixup_environment(environ, args)
cmd += ' '.join('%s=%s' % (shell_escape(k), shell_escape(v))
for k, v in iteritems(environ))
cmd += ' '
# FIXME : Use exec -a or something if binary != argv[0]
if cmdline is None:
argv = [run['binary']] + run['argv'][1:]
else:
argv = cmdline
cmd += ' '.join(shell_escape(a) for a in argv)
userspec = '%s:%s' % (run.get('uid', 1000),
run.get('gid', 1000))
cmd = 'chroot --userspec=%s %s /bin/sh -c %s' % (
userspec,
shell_escape(unicode_(root)),
shell_escape(cmd))
cmds.append(cmd)
cmds = ['chroot %s /bin/sh -c %s' % (shell_escape(unicode_(root)),
shell_escape(c))
for c in x11.init_cmds] + cmds
cmds = ' && '.join(cmds)
# Starts forwarding
forwarders = []
for portnum, connector in x11.port_forward:
fwd = LocalForwarder(connector, portnum)
forwarders.append(fwd)
signals.pre_run(target=target)
retcode = interruptible_call(cmds, shell=True)
stderr.write("\n*** Command finished, status: %d\n" % retcode)
signals.post_run(target=target, retcode=retcode)
# Update input file status
metadata_update_run(config, unpacked_info, selected_runs)
metadata_write(target, unpacked_info, 'chroot')
def chroot_unmount(target):
"""Unmount magic directories, if they are mounted.
"""
unpacked_info = metadata_read(target, 'chroot')
mounted = unpacked_info.get('mounted', False)
if not mounted:
return False
target = target.resolve()
for m in ('/dev', '/proc'):
d = join_root(target / 'root', Path(m))
if d.exists():
logging.info("Unmounting %s...", d)
# Unmounts recursively
subprocess.check_call(
'grep %s /proc/mounts | '
'cut -f2 -d" " | '
'sort -r | '
'xargs umount' % d,
shell=True)
unpacked_info['mounted'] = False
metadata_write(target, unpacked_info, 'chroot')
return True
@target_must_exist
def chroot_destroy_unmount(args):
"""Unmounts the bound magic dirs.
"""
target = Path(args.target[0])
if not chroot_unmount(target):
logging.critical("Magic directories were not mounted")
sys.exit(1)
@target_must_exist
def chroot_destroy_dir(args):
"""Destroys the directory.
"""
target = Path(args.target[0])
mounted = metadata_read(target, 'chroot').get('mounted', False)
if mounted:
logging.critical("Magic directories might still be mounted")
sys.exit(1)
logging.info("Removing directory %s...", target)
signals.pre_destroy(target=target)
rmtree_fixed(target)
signals.post_destroy(target=target)
@target_must_exist
def chroot_destroy(args):
"""Destroys the directory, unmounting first if necessary.
"""
target = Path(args.target[0])
chroot_unmount(target)
logging.info("Removing directory %s...", target)
signals.pre_destroy(target=target)
rmtree_fixed(target)
signals.post_destroy(target=target)
class LocalUploader(FileUploader):
def __init__(self, target, input_files, files, type_, param_restore_owner):
self.type = type_
self.param_restore_owner = param_restore_owner
FileUploader.__init__(self, target, input_files, files)
def prepare_upload(self, files):
self.restore_owner = (self.type == 'chroot' and
should_restore_owner(self.param_restore_owner))
self.root = (self.target / 'root').absolute()
def extract_original_input(self, input_name, input_path, temp):
tar = tarfile.open(str(self.target / 'inputs.tar.gz'), 'r:*')
member = tar.getmember(str(join_root(PosixPath(''), input_path)))
member = copy.copy(member)
member.name = str(temp.components[-1])
tar.extract(member, str(temp.parent))
tar.close()
return temp
def upload_file(self, local_path, input_path):
remote_path = join_root(self.root, input_path)
# Copy
orig_stat = remote_path.stat()
with make_dir_writable(remote_path.parent):
local_path.copyfile(remote_path)
remote_path.chmod(orig_stat.st_mode & 0o7777)
if self.restore_owner:
remote_path.chown(orig_stat.st_uid, orig_stat.st_gid)
@target_must_exist
def upload(args):
"""Replaces an input file in the directory.
"""
target = Path(args.target[0])
files = args.file
unpacked_info = metadata_read(target, args.type)
input_files = unpacked_info.setdefault('input_files', {})
try:
LocalUploader(target, input_files, files,
args.type, args.type == 'chroot' and args.restore_owner)
finally:
metadata_write(target, unpacked_info, args.type)
class LocalDownloader(FileDownloader):
def __init__(self, target, files, type_, all_=False):
self.type = type_
FileDownloader.__init__(self, target, files, all_=all_)
def prepare_download(self, files):
self.root = (self.target / 'root').absolute()
def download_and_print(self, remote_path):
remote_path = join_root(self.root, remote_path)
# Output to stdout
if not remote_path.exists():
logging.critical("Can't get output file (doesn't exist): %s",
remote_path)
return False
with remote_path.open('rb') as fp:
copyfile(fp, stdout_bytes)
return True
def download(self, remote_path, local_path):
remote_path = join_root(self.root, remote_path)
# Copy
if not remote_path.exists():
logging.critical("Can't get output file (doesn't exist): %s",
remote_path)
return False
remote_path.copyfile(local_path)
remote_path.copymode(local_path)
return True
@target_must_exist
def download(args):
"""Gets an output file from the directory.
"""
target = Path(args.target[0])
files = args.file
metadata_read(target, args.type)
LocalDownloader(target, files, args.type, all_=args.all)
def test_same_pkgmngr(pack, config, **kwargs):
"""Compatibility test: platform is Linux and uses same package manager.
"""
runs, packages, other_files = config
orig_distribution = runs[0]['distribution'][0].lower()
if not THIS_DISTRIBUTION:
return COMPAT_NO, "This machine is not running Linux"
elif THIS_DISTRIBUTION == orig_distribution:
return COMPAT_OK
else:
return COMPAT_NO, "Different distributions. Then: %s, now: %s" % (
orig_distribution, THIS_DISTRIBUTION)
def test_linux_same_arch(pack, config, **kwargs):
"""Compatibility test: this platform is Linux and arch is compatible.
"""
runs, packages, other_files = config
orig_architecture = runs[0]['architecture']
current_architecture = platform.machine().lower()
if platform.system().lower() != 'linux':
return COMPAT_NO, "This machine is not running Linux"
elif (orig_architecture == current_architecture or
(orig_architecture == 'i386' and current_architecture == 'amd64')):
return COMPAT_OK
else:
return COMPAT_NO, "Different architectures. Then: %s, now: %s" % (
orig_architecture, current_architecture)
def setup_installpkgs(parser):
"""Installs the required packages on this system
"""
parser.add_argument('pack', nargs=1, help="Pack to process")
parser.add_argument(
'-y', '--assume-yes', action='store_true', default=False,
help="Assumes yes for package manager's questions (if supported)")
parser.add_argument(
'--missing', action='store_true',
help="Only install packages that weren't packed")
parser.add_argument(
'--summary', action='store_true',
help="Don't install, print which packages are installed or not")
parser.set_defaults(func=installpkgs)
return {'test_compatibility': test_same_pkgmngr}
def setup_directory(parser, **kwargs):
"""Unpacks the files in a directory and runs with PATH and LD_LIBRARY_PATH
setup creates the directory (needs the pack filename)
upload replaces input files in the directory
(without arguments, lists input files)
run runs the experiment
download gets output files
(without arguments, lists output files)
destroy removes the unpacked directory
Upload specifications are either:
:input_id restores the original input file from the pack
filename:input_id replaces the input file with the specified local
file
Download specifications are either:
output_id: print the output file to stdout
output_id:filename extracts the output file to the corresponding local
path
"""
subparsers = parser.add_subparsers(title="actions",
metavar='', help=argparse.SUPPRESS)
def add_opt_general(opts):
opts.add_argument('target', nargs=1, help="Experiment directory")
# setup
parser_setup = subparsers.add_parser('setup')
parser_setup.add_argument('pack', nargs=1, help="Pack to extract")
# Note: add_opt_general is called later so that 'pack' is before 'target'
add_opt_general(parser_setup)
parser_setup.set_defaults(func=directory_create)
# upload
parser_upload = subparsers.add_parser('upload')
add_opt_general(parser_upload)
parser_upload.add_argument('file', nargs=argparse.ZERO_OR_MORE,
help=":")
parser_upload.set_defaults(func=upload, type='directory')
# run
parser_run = subparsers.add_parser('run')
add_opt_general(parser_run)
parser_run.add_argument('run', default=None, nargs=argparse.OPTIONAL)
parser_run.add_argument('--cmdline', nargs=argparse.REMAINDER,
help="Command line to run")
parser_run.add_argument('--enable-x11', action='store_true', default=False,
dest='x11',
help="Enable X11 support (needs an X server)")
add_environment_options(parser_run)
parser_run.set_defaults(func=directory_run)
# download
parser_download = subparsers.add_parser('download')
add_opt_general(parser_download)
parser_download.add_argument('file', nargs=argparse.ZERO_OR_MORE,
help="[:]")
parser_download.add_argument('--all', action='store_true',
help="Download all output files to the "
"current directory")
parser_download.set_defaults(func=download, type='directory')
# destroy
parser_destroy = subparsers.add_parser('destroy')
add_opt_general(parser_destroy)
parser_destroy.set_defaults(func=directory_destroy)
return {'test_compatibility': test_linux_same_arch}
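# A minimal usage sketch for this unpacker (pack, directory, and file ids are
# hypothetical; real ids come from the pack's configuration), mirroring the
# subcommands documented in the docstring above:
#   reprounzip directory setup experiment.rpz ./exp
#   reprounzip directory upload ./exp mydata.csv:input_data
#   reprounzip directory run ./exp
#   reprounzip directory download ./exp results:./out.txt
#   reprounzip directory destroy ./exp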
def chroot_setup(args):
"""Does both create and mount depending on --bind-magic-dirs.
"""
do_mount = should_mount_magic_dirs(args.bind_magic_dirs)
chroot_create(args)
if do_mount:
chroot_mount(args)
def setup_chroot(parser, **kwargs):
"""Unpacks the files and run with chroot
setup/create creates the directory (needs the pack filename)
setup/mount mounts --bind /dev and /proc inside the chroot
(do NOT rm -Rf the directory after that!)
upload replaces input files in the directory
(without arguments, lists input files)
run runs the experiment
download gets output files
(without arguments, lists output files)
destroy/unmount unmounts /dev and /proc from the directory
destroy/dir removes the unpacked directory
Upload specifications are either:
:input_id restores the original input file from the pack
filename:input_id replaces the input file with the specified local
file
Download specifications are either:
output_id: print the output file to stdout
output_id:filename extracts the output file to the corresponding local
path
"""
subparsers = parser.add_subparsers(title="actions",
metavar='', help=argparse.SUPPRESS)
def add_opt_general(opts):
opts.add_argument('target', nargs=1, help="Experiment directory")
# setup/create
def add_opt_setup(opts):
opts.add_argument('pack', nargs=1, help="Pack to extract")
def add_opt_owner(opts):
opts.add_argument('--preserve-owner', action='store_true',
dest='restore_owner', default=None,
help="Restore files' owner/group when extracting")
opts.add_argument('--dont-preserve-owner', action='store_false',
dest='restore_owner', default=None,
help="Don't restore files' owner/group when "
"extracting, use current users")
parser_setup_create = subparsers.add_parser('setup/create')
add_opt_setup(parser_setup_create)
add_opt_general(parser_setup_create)
add_opt_owner(parser_setup_create)
parser_setup_create.set_defaults(func=chroot_create)
# setup/mount
parser_setup_mount = subparsers.add_parser('setup/mount')
add_opt_general(parser_setup_mount)
parser_setup_mount.set_defaults(func=chroot_mount)
# setup
parser_setup = subparsers.add_parser('setup')
add_opt_setup(parser_setup)
add_opt_general(parser_setup)
add_opt_owner(parser_setup)
parser_setup.add_argument(
'--bind-magic-dirs', action='store_true',
dest='bind_magic_dirs', default=None,
help="Mount /dev and /proc inside the chroot")
parser_setup.add_argument(
'--dont-bind-magic-dirs', action='store_false',
dest='bind_magic_dirs', default=None,
help="Don't mount /dev and /proc inside the chroot")
parser_setup.set_defaults(func=chroot_setup)
# upload
parser_upload = subparsers.add_parser('upload')
add_opt_general(parser_upload)
add_opt_owner(parser_upload)
parser_upload.add_argument('file', nargs=argparse.ZERO_OR_MORE,
help=":")
parser_upload.set_defaults(func=upload, type='chroot')
# run
parser_run = subparsers.add_parser('run')
add_opt_general(parser_run)
parser_run.add_argument('run', default=None, nargs=argparse.OPTIONAL)
parser_run.add_argument('--cmdline', nargs=argparse.REMAINDER,
help="Command line to run")
parser_run.add_argument('--enable-x11', action='store_true', default=False,
dest='x11',
help="Enable X11 support (needs an X server on "
"the host)")
parser_run.add_argument('--x11-display', dest='x11_display',
help="Display number to use on the experiment "
"side (change the host display with the "
"DISPLAY environment variable)")
add_environment_options(parser_run)
parser_run.set_defaults(func=chroot_run)
# download
parser_download = subparsers.add_parser('download')
add_opt_general(parser_download)
parser_download.add_argument('file', nargs=argparse.ZERO_OR_MORE,
help="[:]")
parser_download.add_argument('--all', action='store_true',
help="Download all output files to the "
"current directory")
parser_download.set_defaults(func=download, type='chroot')
# destroy/unmount
parser_destroy_unmount = subparsers.add_parser('destroy/unmount')
add_opt_general(parser_destroy_unmount)
parser_destroy_unmount.set_defaults(func=chroot_destroy_unmount)
# destroy/dir
parser_destroy_dir = subparsers.add_parser('destroy/dir')
add_opt_general(parser_destroy_dir)
parser_destroy_dir.set_defaults(func=chroot_destroy_dir)
# destroy
parser_destroy = subparsers.add_parser('destroy')
add_opt_general(parser_destroy)
parser_destroy.set_defaults(func=chroot_destroy)
return {'test_compatibility': test_linux_same_arch}
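# A minimal usage sketch for the chroot unpacker (names are hypothetical):
# 'setup' combines setup/create and setup/mount, and 'destroy' unmounts
# before removing the directory. chroot(2) generally requires root:
#   sudo reprounzip chroot setup experiment.rpz ./exp
#   sudo reprounzip chroot run ./exp
#   sudo reprounzip chroot destroy ./exp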
reprounzip-1.0.10/reprounzip/unpackers/graph.py 0000644 0000765 0000024 00000072362 13127776450 022325 0 ustar remram staff 0000000 0000000 # Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
"""Graph plugin for reprounzip.
This is not actually an unpacker, it just creates a graph from the metadata
collected by the reprozip tracer (either from a pack file or the initial .rpz
directory).
It creates a file in GraphViz DOT format, which can be turned into an image by
using the dot utility.
See http://www.graphviz.org/
"""
from __future__ import division, print_function, unicode_literals
import argparse
from distutils.version import LooseVersion
import heapq
import json
import logging
import re
from rpaths import PosixPath, Path
import sqlite3
import sys
from reprounzip.common import FILE_READ, FILE_WRITE, FILE_WDIR, RPZPack, \
load_config
from reprounzip.orderedset import OrderedSet
from reprounzip.unpackers.common import COMPAT_OK, COMPAT_NO
from reprounzip.utils import PY3, izip, iteritems, itervalues, stderr, \
unicode_, escape, normalize_path
C_INITIAL = 0 # First process or don't know
C_FORK = 1 # Might actually be any one of fork, vfork or clone
C_EXEC = 2 # Replaced image with execve
C_FORKEXEC = 3 # A fork then an exec, folded as one because all_forks==False
FORMAT_DOT = 0
FORMAT_JSON = 1
LVL_PKG_FILE = 0 # Show individual files in packages
LVL_PKG_PACKAGE = 1 # Aggregate by package
LVL_PKG_IGNORE = 2 # Ignore packages, treat them like any file
LVL_PKG_DROP = 3 # Drop every file that comes from a package
LVL_PROC_THREAD = 0 # Show every process and thread
LVL_PROC_PROCESS = 1 # Only show processes, not threads
LVL_PROC_RUN = 2 # Don't show individual processes, aggregate by run
LVL_OTHER_ALL = 0 # Show every file, aggregate through directory list
LVL_OTHER_IO = 1 # Only show input & output files
LVL_OTHER_NO = 3 # Don't show other files
class Run(object):
"""Structure representing a whole run.
"""
def __init__(self, nb):
self.nb = nb
self.name = "run %d" % nb
self.processes = []
def dot(self, fp, level_processes):
assert self.processes
if level_processes == LVL_PROC_RUN:
fp.write(' run%d [label="%d: %s"];\n' % (
self.nb, self.nb, self.processes[0].binary or "-"))
else:
fp.write(' subgraph cluster_run%d {\n label="%s";\n' % (
self.nb, escape(self.name)))
for process in self.processes:
if level_processes == LVL_PROC_THREAD or not process.thread:
process.dot(fp, level_processes, indent=2)
fp.write(' }\n')
def dot_endpoint(self, level_processes):
return 'run%d' % self.nb
def json(self, prog_map, level_processes):
assert self.processes
if level_processes == LVL_PROC_RUN:
json_process = self.processes[0].json()
for process in self.processes:
prog_map[process] = json_process
processes = [json_process]
else:
processes = []
process_idx_map = {}
for process in self.processes:
if level_processes == LVL_PROC_THREAD or not process.thread:
process_idx_map[process] = len(processes)
json_process = process.json(process_idx_map)
prog_map[process] = json_process
processes.append(json_process)
else:
p_process = process
while p_process.thread:
p_process = p_process.parent
prog_map[process] = prog_map[p_process]
return {'name': self.name, 'processes': processes}
class Process(object):
"""Structure representing a process in the experiment.
"""
_id_gen = 0
def __init__(self, pid, run, parent, timestamp, thread, acted, binary,
argv, created):
self.id = Process._id_gen
Process._id_gen += 1
self.pid = pid
self.run = run
self.parent = parent
self.timestamp = timestamp
self.thread = bool(thread)
# Whether that process has done something yet. If it execve()s and
# hasn't done anything since it forked, no need for it to appear
self.acted = acted
# Executable file
self.binary = binary
# Command-line if this was created by an exec
self.argv = argv
# How was this process created, one of the C_* constants
self.created = created
def dot(self, fp, level_processes, indent=1):
thread_style = ',fillcolor="#666666"' if self.thread else ''
fp.write(' ' * indent + 'prog%d [label="%s (%d)"%s];\n' % (
self.id, escape(unicode_(self.binary) or "-"),
self.pid, thread_style))
if self.parent is not None:
reason = ''
if self.created == C_FORK:
if self.thread:
reason = "thread"
else:
reason = "fork"
elif self.created == C_EXEC:
reason = "exec"
elif self.created == C_FORKEXEC:
reason = "fork+exec"
fp.write(' ' * indent + 'prog%d -> prog%d [label="%s"];\n' % (
self.parent.id, self.id, reason))
def dot_endpoint(self, level_processes):
if level_processes == LVL_PROC_RUN:
return self.run.dot_endpoint(level_processes)
else:
prog = self
if level_processes == LVL_PROC_PROCESS:
while prog.thread:
prog = prog.parent
return 'prog%d' % prog.id
def json(self, process_map):
name = "%d" % self.pid
long_name = "%s (%d)" % (PosixPath(self.binary).components[-1]
if self.binary else "-",
self.pid)
description = "%s\n%d" % (self.binary, self.pid)
if self.parent is not None:
if self.created == C_FORK:
reason = "fork"
elif self.created == C_EXEC:
reason = "exec"
elif self.created == C_FORKEXEC:
reason = "fork+exec"
else:
assert False
parent = [process_map[self.parent], reason]
else:
parent = None
return {'name': name, 'parent': parent, 'reads': [], 'writes': [],
'long_name': long_name, 'description': description,
'argv': self.argv, 'is_thread': self.thread,
'start_time': self.timestamp}
class Package(object):
"""Structure representing a system package.
"""
def __init__(self, name, version=None):
self.id = None
self.name = name
self.version = version
self.files = set()
def dot(self, fp, level_pkgs):
assert self.id is not None
if not self.files:
return
if level_pkgs == LVL_PKG_PACKAGE:
fp.write(' "pkg %s" [shape=box,label=' % escape(self.name))
if self.version:
fp.write('"%s %s"];\n' % (
escape(self.name), escape(self.version)))
else:
fp.write('"%s"];\n' % escape(self.name))
elif level_pkgs == LVL_PKG_FILE:
fp.write(' subgraph cluster_pkg%d {\n label=' % self.id)
if self.version:
fp.write('"%s %s";\n' % (
escape(self.name), escape(self.version)))
else:
fp.write('"%s";\n' % escape(self.name))
for f in sorted(unicode_(f) for f in self.files):
fp.write(' "%s";\n' % escape(f))
fp.write(' }\n')
def dot_endpoint(self, f, level_pkgs):
if level_pkgs == LVL_PKG_PACKAGE:
return '"pkg %s"' % escape(self.name)
else:
return '"%s"' % escape(unicode_(f))
def json_endpoint(self, f, level_pkgs):
if level_pkgs == LVL_PKG_PACKAGE:
return self.name
else:
return unicode_(f)
def json(self, level_pkgs):
if level_pkgs == LVL_PKG_PACKAGE:
logging.critical("JSON output doesn't support --packages package")
sys.exit(1)
elif level_pkgs == LVL_PKG_FILE:
files = sorted(unicode_(f) for f in self.files)
else:
assert False
return {'name': self.name, 'version': self.version or None,
'files': files}
def parse_levels(level_pkgs, level_processes, level_other_files):
try:
level_pkgs = {'file': LVL_PKG_FILE,
'files': LVL_PKG_FILE,
'package': LVL_PKG_PACKAGE,
'packages': LVL_PKG_PACKAGE,
'ignore': LVL_PKG_IGNORE,
'drop': LVL_PKG_DROP}[level_pkgs]
except KeyError:
logging.critical("Unknown level of detail for packages: '%s'",
level_pkgs)
sys.exit(1)
try:
level_processes = {'thread': LVL_PROC_THREAD,
'threads': LVL_PROC_THREAD,
'process': LVL_PROC_PROCESS,
'processes': LVL_PROC_PROCESS,
'run': LVL_PROC_RUN,
'runs': LVL_PROC_RUN}[level_processes]
except KeyError:
logging.critical("Unknown level of detail for processes: '%s'",
level_processes)
sys.exit(1)
if level_other_files.startswith('depth:'):
file_depth = int(level_other_files[6:])
level_other_files = 'all'
else:
file_depth = None
try:
level_other_files = {'all': LVL_OTHER_ALL,
'io': LVL_OTHER_IO,
'inputoutput': LVL_OTHER_IO,
'no': LVL_OTHER_NO,
'none': LVL_OTHER_NO,
'drop': LVL_OTHER_NO}[level_other_files]
except KeyError:
logging.critical("Unknown level of detail for other files: '%s'",
level_other_files)
sys.exit(1)
return level_pkgs, level_processes, level_other_files, file_depth
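# Illustrative mapping (not in the original source): for instance,
#   parse_levels('package', 'process', 'depth:2')
# returns (LVL_PKG_PACKAGE, LVL_PROC_PROCESS, LVL_OTHER_ALL, 2); the 'depth:N'
# form selects 'all' detail but later truncates non-package paths to N
# components.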
def read_events(database, all_forks, has_thread_flag):
    # In here, a file is any file on the filesystem. A binary is a file that
    # gets executed. A process is a system-level task, identified by its pid
    # (pids don't get reused in the database).
    # What we call a "program" is the pair (process, binary): forking creates
    # a new program (with the same binary) and exec'ing creates a new program
    # as well (with the same process).
    # Because of this, fork+exec will create an intermediate program that
    # doesn't do anything (new process but still the old binary). If that
    # program doesn't do anything worth showing on the graph, it is erased,
    # unless all_forks is True (--all-forks).
if PY3:
# On PY3, connect() only accepts unicode
conn = sqlite3.connect(str(database))
else:
conn = sqlite3.connect(database.path)
conn.row_factory = sqlite3.Row
# This is a bit weird. We need to iterate on all types of events at the
# same time, ordering by timestamp, so we decorate-sort-undecorate
# Decoration adds timestamp (for sorting) and tags by event type, one of
# 'process', 'open' or 'exec'
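    # For instance, the decorated streams below look like
    #   (timestamp, 'process', row), (timestamp, 'open', row), ...
    # and heapq.merge() interleaves them in timestamp order, assuming each
    # query already yields its rows chronologically (id order here)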
# Reads processes from the database
process_cursor = conn.cursor()
if has_thread_flag:
sql = '''
SELECT id, parent, timestamp, is_thread
FROM processes
ORDER BY id
'''
else:
sql = '''
SELECT id, parent, timestamp, 0 as is_thread
FROM processes
ORDER BY id
'''
process_rows = process_cursor.execute(sql)
processes = {}
all_programs = []
# ... and opened files...
file_cursor = conn.cursor()
file_rows = file_cursor.execute(
'''
SELECT name, timestamp, mode, process, is_directory
FROM opened_files
ORDER BY id
''')
binaries = set()
files = set()
edges = OrderedSet()
# ... as well as executed files.
exec_cursor = conn.cursor()
exec_rows = exec_cursor.execute(
'''
SELECT name, timestamp, process, argv
FROM executed_files
ORDER BY id
''')
# Loop on all event lists
logging.info("Getting all events from database...")
rows = heapq.merge(((r[2], 'process', r) for r in process_rows),
((r[1], 'open', r) for r in file_rows),
((r[1], 'exec', r) for r in exec_rows))
runs = []
run = None
for ts, event_type, data in rows:
if event_type == 'process':
r_id, r_parent, r_timestamp, r_thread = data
logging.debug("Process %d created (parent %r)", r_id, r_parent)
if r_parent is not None:
parent = processes[r_parent]
binary = parent.binary
else:
run = Run(len(runs))
runs.append(run)
parent = None
binary = None
if r_parent is not None:
argv = processes[r_parent].argv
else:
argv = None
process = Process(r_id,
run,
parent,
r_timestamp,
r_thread,
False,
binary,
argv,
C_INITIAL if r_parent is None else C_FORK)
processes[r_id] = process
all_programs.append(process)
run.processes.append(process)
elif event_type == 'open':
r_name, r_timestamp, r_mode, r_process, r_directory = data
r_name = normalize_path(r_name)
logging.debug("File open: %s, process %d", r_name, r_process)
if not (r_mode & FILE_WDIR or r_directory):
process = processes[r_process]
files.add(r_name)
edges.add((process, r_name, r_mode, None))
elif event_type == 'exec':
r_name, r_timestamp, r_process, r_argv = data
r_name = normalize_path(r_name)
argv = tuple(r_argv.split('\0'))
if not argv[-1]:
argv = argv[:-1]
logging.debug("File exec: %s, process %d", r_name, r_process)
process = processes[r_process]
binaries.add(r_name)
# Here we split this process in two "programs", unless the previous
# one hasn't done anything since it was created via fork()
if not all_forks and not process.acted:
process.binary = r_name
process.created = C_FORKEXEC
process.acted = True
process.argv = argv
else:
process = Process(process.pid,
run,
process,
r_timestamp,
                                  False,  # not a thread
                                  True,   # acted: hides the exec only once
r_name,
argv,
C_EXEC)
all_programs.append(process)
processes[r_process] = process
run.processes.append(process)
files.add(r_name)
edges.add((process, r_name, None, argv))
process_cursor.close()
file_cursor.close()
exec_cursor.close()
conn.close()
return runs, files, edges
def format_argv(argv):
joined = ' '.join(argv)
if len(joined) < 50:
return joined
else:
return "%s ..." % argv[0]
def generate(target, configfile, database, all_forks=False, graph_format='dot',
level_pkgs='file', level_processes='thread',
level_other_files='all',
regex_filters=None, regex_replaces=None, aggregates=None):
"""Main function for the graph subcommand.
"""
try:
graph_format = {'dot': FORMAT_DOT, 'DOT': FORMAT_DOT,
'json': FORMAT_JSON, 'JSON': FORMAT_JSON}[graph_format]
except KeyError:
logging.critical("Unknown output format %r", graph_format)
sys.exit(1)
level_pkgs, level_processes, level_other_files, file_depth = \
parse_levels(level_pkgs, level_processes, level_other_files)
# Reads package ownership from the configuration
if not configfile.is_file():
logging.critical("Configuration file does not exist!\n"
"Did you forget to run 'reprozip trace'?\n"
"If not, you might want to use --dir to specify an "
"alternate location.")
sys.exit(1)
config = load_config(configfile, canonical=False)
inputs_outputs = config.inputs_outputs
inputs_outputs_map = dict((f.path, n)
for n, f in iteritems(config.inputs_outputs))
has_thread_flag = config.format_version >= LooseVersion('0.7')
runs, files, edges = read_events(database, all_forks,
has_thread_flag)
# Label the runs
if len(runs) != len(config.runs):
logging.warning("Configuration file doesn't list the same number of "
"runs we found in the database!")
else:
for config_run, run in izip(config.runs, runs):
run.name = config_run['id']
# Apply regexes
    ignore = [lambda path, r=re.compile(p): r.search(path) is not None
              for p in regex_filters or []]
    # Bind the replacement as a default argument too, so that each lambda
    # keeps its own replacement instead of closing over the loop variable
    replace = [lambda path, r=re.compile(p), repl=repl: r.sub(repl, path)
               for p, repl in regex_replaces or []]
def filefilter(path):
pathuni = unicode_(path)
if any(f(pathuni) for f in ignore):
logging.debug("IGN %s", pathuni)
return None
if not (replace or aggregates):
return path
for fi in replace:
pathuni_ = fi(pathuni)
if pathuni_ != pathuni:
logging.debug("SUB %s -> %s", pathuni, pathuni_)
pathuni = pathuni_
for prefix in aggregates or []:
if pathuni.startswith(prefix):
logging.debug("AGG %s -> %s", pathuni, prefix)
pathuni = prefix
break
return PosixPath(pathuni)
files_new = set()
for fi in files:
fi = filefilter(fi)
if fi is not None:
files_new.add(fi)
files = files_new
edges_new = OrderedSet()
for prog, fi, mode, argv in edges:
fi = filefilter(fi)
if fi is not None:
edges_new.add((prog, fi, mode, argv))
edges = edges_new
# Puts files in packages
package_map = {}
if level_pkgs == LVL_PKG_IGNORE:
packages = []
other_files = files
else:
logging.info("Organizes packages...")
file2package = dict((f.path, pkg)
for pkg in config.packages for f in pkg.files)
packages = {}
other_files = []
for fi in files:
pkg = file2package.get(fi)
if pkg is not None:
package = packages.get(pkg.name)
if package is None:
package = Package(pkg.name, pkg.version)
packages[pkg.name] = package
package.files.add(fi)
package_map[fi] = package
else:
other_files.append(fi)
packages = sorted(itervalues(packages), key=lambda pkg: pkg.name)
for i, pkg in enumerate(packages):
pkg.id = i
# Filter other files
if level_other_files == LVL_OTHER_ALL and file_depth is not None:
other_files = set(PosixPath(*f.components[:file_depth + 1])
for f in other_files)
edges = OrderedSet((prog,
f if f in package_map
else PosixPath(*f.components[:file_depth + 1]),
mode,
argv)
for prog, f, mode, argv in edges)
else:
if level_other_files == LVL_OTHER_IO:
other_files = set(f
for f in other_files if f in inputs_outputs_map)
edges = [(prog, f, mode, argv)
for prog, f, mode, argv in edges
if f in package_map or f in other_files]
elif level_other_files == LVL_OTHER_NO:
other_files = set()
edges = [(prog, f, mode, argv)
for prog, f, mode, argv in edges
if f in package_map]
args = (target, runs, packages, other_files, package_map, edges,
inputs_outputs, inputs_outputs_map,
level_pkgs, level_processes, level_other_files)
if graph_format == FORMAT_DOT:
graph_dot(*args)
elif graph_format == FORMAT_JSON:
graph_json(*args)
else:
assert False
def graph_dot(target, runs, packages, other_files, package_map, edges,
inputs_outputs, inputs_outputs_map,
level_pkgs, level_processes, level_other_files):
"""Writes a GraphViz DOT file from the collected information.
"""
with target.open('w', encoding='utf-8', newline='\n') as fp:
fp.write('digraph G {\n /* programs */\n'
' node [shape=box fontcolor=white '
'fillcolor=black style=filled];\n')
# Programs
logging.info("Writing programs...")
for run in runs:
run.dot(fp, level_processes)
fp.write('\n'
' node [shape=ellipse fontcolor="#131C39" '
'fillcolor="#C9D2ED"];\n')
# Packages
if level_pkgs not in (LVL_PKG_IGNORE, LVL_PKG_DROP):
logging.info("Writing packages...")
fp.write('\n /* system packages */\n')
for package in sorted(packages, key=lambda pkg: pkg.name):
package.dot(fp, level_pkgs)
fp.write('\n /* other files */\n')
# Other files
logging.info("Writing other files...")
for fi in sorted(other_files):
if fi in inputs_outputs_map:
fp.write(' "%(path)s" [fillcolor="#A3B4E0", '
'label="%(name)s\\n%(path)s"];\n' %
{'path': escape(unicode_(fi)),
'name': inputs_outputs_map[fi]})
else:
fp.write(' "%s";\n' % escape(unicode_(fi)))
fp.write('\n')
# Edges
logging.info("Connecting edges...")
done_edges = set()
for prog, fi, mode, argv in edges:
endp_prog = prog.dot_endpoint(level_processes)
if fi in package_map:
if level_pkgs == LVL_PKG_DROP:
continue
endp_file = package_map[fi].dot_endpoint(fi, level_pkgs)
e = endp_prog, endp_file, mode
if e in done_edges:
continue
else:
done_edges.add(e)
else:
endp_file = '"%s"' % escape(unicode_(fi))
if mode is None:
fp.write(' %s -> %s [style=bold, label="%s"];\n' % (
endp_file,
endp_prog,
escape(format_argv(argv))))
elif mode & FILE_WRITE:
fp.write(' %s -> %s [color="#000088"];\n' % (
endp_prog, endp_file))
elif mode & FILE_READ:
fp.write(' %s -> %s [color="#8888CC"];\n' % (
endp_file, endp_prog))
fp.write('}\n')
def graph_json(target, runs, packages, other_files, package_map, edges,
inputs_outputs, inputs_outputs_map,
level_pkgs, level_processes, level_other_files):
"""Writes a JSON file suitable for further processing.
"""
# Packages
if level_pkgs in (LVL_PKG_IGNORE, LVL_PKG_DROP):
json_packages = []
else:
json_packages = [pkg.json(level_pkgs) for pkg in packages]
# Other files
json_other_files = [unicode_(fi) for fi in sorted(other_files)]
# Programs
prog_map = {}
json_runs = [run.json(prog_map, level_processes) for run in runs]
# Connect edges
done_edges = set()
for prog, fi, mode, argv in edges:
endp_prog = prog_map[prog]
if fi in package_map:
if level_pkgs == LVL_PKG_DROP:
continue
endp_file = package_map[fi].json_endpoint(fi, level_pkgs)
e = endp_prog['name'], endp_file, mode
if e in done_edges:
continue
else:
done_edges.add(e)
else:
endp_file = unicode_(fi)
if mode is None:
endp_prog['reads'].append(endp_file)
# TODO: argv?
elif mode & FILE_WRITE:
endp_prog['writes'].append(endp_file)
elif mode & FILE_READ:
endp_prog['reads'].append(endp_file)
json_other_files.sort()
if PY3:
fp = target.open('w', encoding='utf-8', newline='\n')
else:
fp = target.open('wb')
try:
json.dump({'packages': sorted(json_packages,
key=lambda p: p['name']),
'other_files': json_other_files,
'runs': json_runs,
'inputs_outputs': [
{'name': k, 'path': unicode_(v.path),
'read_by_runs': v.read_runs,
'written_by_runs': v.write_runs}
for k, v in sorted(iteritems(inputs_outputs))]},
fp,
ensure_ascii=False,
indent=2,
sort_keys=True)
finally:
fp.close()
def graph(args):
"""graph subcommand.
Reads in the trace sqlite3 database and writes out a graph in GraphViz DOT
format or JSON.
"""
def call_generate(args, config, trace):
generate(Path(args.target[0]), config, trace, args.all_forks,
args.format, args.packages, args.processes, args.otherfiles,
args.regex_filter, args.regex_replace, args.aggregate)
if args.pack is not None:
rpz_pack = RPZPack(args.pack)
with rpz_pack.with_config() as config:
with rpz_pack.with_trace() as trace:
call_generate(args, config, trace)
else:
call_generate(args,
Path(args.dir) / 'config.yml',
Path(args.dir) / 'trace.sqlite3')
def disabled_bug13676(args):
stderr.write("Error: your version of Python, %s, is not supported\n"
"Versions before 2.7.3 are affected by bug 13676 and will "
"not be able to read\nthe trace "
"database\n" % sys.version.split(' ', 1)[0])
sys.exit(1)
def setup(parser, **kwargs):
"""Generates a provenance graph from the trace data
"""
# http://bugs.python.org/issue13676
# This prevents repro(un)zip from reading argv and envp arrays from trace
if sys.version_info < (2, 7, 3):
parser.add_argument('rest_of_cmdline', nargs=argparse.REMAINDER,
help=argparse.SUPPRESS)
parser.set_defaults(func=disabled_bug13676)
        return {'test_compatibility': (COMPAT_NO,
                                       "Python >= 2.7.3 required")}
parser.add_argument('target', nargs=1, help="Destination DOT file")
parser.add_argument('-F', '--all-forks', action='store_true',
help="Show forked processes before they exec")
parser.add_argument('--packages', default='file',
help="Level of detail for packages; 'file', "
"'package', 'drop' or 'ignore' (default: 'file')")
parser.add_argument('--processes', default='thread',
help="Level of detail for processes; 'thread', "
"'process' or 'run' (default: 'thread')")
parser.add_argument('--otherfiles', default='all',
help="Level of detail for non-package files; 'all', "
"'io' or 'no' (default: 'all')")
parser.add_argument('--aggregate', action='append',
help="Aggregate all files under this path")
parser.add_argument('--regex-filter', action='append',
help="Glob patterns of files to ignore")
parser.add_argument('--regex-replace', action='append', nargs=2,
help="Apply regular expression replacement to files")
parser.add_argument('--dot', action='store_const', dest='format',
const='dot', default='dot',
help="Set the output format to DOT (this is the "
"default)")
parser.add_argument('--json', action='store_const', dest='format',
const='json', help="Set the output format to JSON")
parser.add_argument(
'-d', '--dir', default='.reprozip-trace',
help="where the database and configuration file are stored (default: "
"./.reprozip-trace)")
parser.add_argument(
'pack', nargs=argparse.OPTIONAL,
help="Pack to extract (defaults to reading from --dir)")
parser.set_defaults(func=graph)
return {'test_compatibility': COMPAT_OK}
reprounzip-1.0.10/reprounzip/utils.py 0000644 0000765 0000024 00000031730 13127776450 020363 0 ustar remram staff 0000000 0000000 # Copyright (C) 2014-2017 New York University
# This file is part of ReproZip which is released under the Revised BSD License
# See file LICENSE for full license details.
# This file is shared:
# reprozip/reprozip/utils.py
# reprounzip/reprounzip/utils.py
"""Utility functions.
These functions are shared between reprozip and reprounzip but are not
specific to this software; they are general-purpose utilities.
"""
from __future__ import division, print_function, unicode_literals
import codecs
import contextlib
import email.utils
import itertools
import locale
import logging
import operator
import os
import requests
from rpaths import Path, PosixPath
import stat
import subprocess
import sys
class StreamWriter(object):
def __init__(self, stream):
writer = codecs.getwriter(locale.getpreferredencoding())
self._writer = writer(stream, 'replace')
self.buffer = stream
def writelines(self, lines):
self.write(str('').join(lines))
def write(self, obj):
if isinstance(obj, bytes):
self.buffer.write(obj)
else:
self._writer.write(obj)
def __getattr__(self, name,
getattr=getattr):
""" Inherit all other methods from the underlying stream.
"""
return getattr(self._writer, name)
PY3 = sys.version_info[0] == 3
if PY3:
izip = zip
irange = range
iteritems = lambda d: d.items()
itervalues = lambda d: d.values()
listvalues = lambda d: list(d.values())
stdout_bytes, stderr_bytes = sys.stdout.buffer, sys.stderr.buffer
stdin_bytes = sys.stdin.buffer
stdout, stderr = sys.stdout, sys.stderr
else:
izip = itertools.izip
irange = xrange # noqa: F821
iteritems = lambda d: d.iteritems()
itervalues = lambda d: d.itervalues()
listvalues = lambda d: d.values()
_writer = codecs.getwriter(locale.getpreferredencoding())
stdout_bytes, stderr_bytes = sys.stdout, sys.stderr
stdin_bytes = sys.stdin
stdout, stderr = StreamWriter(sys.stdout), StreamWriter(sys.stderr)
if PY3:
int_types = int,
unicode_ = str
else:
int_types = int, long # noqa: F821
unicode_ = unicode # noqa: F821
def flatten(n, l):
"""Flattens an iterable by repeatedly calling chain.from_iterable() on it.
>>> a = [[1, 2, 3], [4, 5, 6]]
>>> b = [[7, 8], [9, 10, 11, 12, 13, 14, 15, 16]]
>>> l = [a, b]
>>> list(flatten(0, a))
[[1, 2, 3], [4, 5, 6]]
>>> list(flatten(1, a))
[1, 2, 3, 4, 5, 6]
>>> list(flatten(1, l))
[[1, 2, 3], [4, 5, 6], [7, 8], [9, 10, 11, 12, 13, 14, 15, 16]]
>>> list(flatten(2, l))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
"""
for _ in irange(n):
l = itertools.chain.from_iterable(l)
return l
class UniqueNames(object):
"""Makes names unique amongst the ones it's already seen.
"""
def __init__(self):
self.names = set()
def insert(self, name):
assert name not in self.names
self.names.add(name)
def __call__(self, name):
nb = 1
attempt = name
while attempt in self.names:
nb += 1
attempt = '%s_%d' % (name, nb)
self.names.add(attempt)
return attempt
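# Illustrative behavior (sketch, not part of the original source):
#   >>> names = UniqueNames()
#   >>> names('run')
#   'run'
#   >>> names('run')
#   'run_2'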
def escape(s):
"""Escapes backslashes and double quotes in strings.
This does NOT add quotes around the string.
"""
return s.replace('\\', '\\\\').replace('"', '\\"')
class CommonEqualityMixin(object):
"""Common mixin providing comparison by comparing ``__dict__`` attributes.
"""
def __eq__(self, other):
return (isinstance(other, self.__class__) and
self.__dict__ == other.__dict__)
def __ne__(self, other):
return not self.__eq__(other)
def optional_return_type(req_args, other_args):
"""Sort of namedtuple but with name-only fields.
When deconstructing a namedtuple, you have to get all the fields:
>>> o = namedtuple('T', ['a', 'b', 'c'])(1, 2, 3)
>>> a, b = o
ValueError: too many values to unpack
You thus cannot easily add new return values. This class allows it:
>>> o2 = optional_return_type(['a', 'b'], ['c'])(1, 2, 3)
>>> a, b = o2
>>> c = o2.c
"""
if len(set(req_args) | set(other_args)) != len(req_args) + len(other_args):
        raise ValueError("Field names must be unique")
# Maps argument name to position in each list
req_args_pos = dict((n, i) for i, n in enumerate(req_args))
other_args_pos = dict((n, i) for i, n in enumerate(other_args))
def cstr(cls, *args, **kwargs):
if len(args) > len(req_args) + len(other_args):
raise TypeError(
"Too many arguments (expected at least %d and no more than "
"%d)" % (len(req_args),
len(req_args) + len(other_args)))
args1, args2 = args[:len(req_args)], args[len(req_args):]
req = dict((i, v) for i, v in enumerate(args1))
other = dict(izip(other_args, args2))
for k, v in iteritems(kwargs):
if k in req_args_pos:
pos = req_args_pos[k]
if pos in req:
raise TypeError("Multiple values for field %s" % k)
req[pos] = v
elif k in other_args_pos:
if k in other:
raise TypeError("Multiple values for field %s" % k)
other[k] = v
else:
raise TypeError("Unknown field name %s" % k)
args = []
for i, k in enumerate(req_args):
if i not in req:
raise TypeError("Missing value for field %s" % k)
args.append(req[i])
inst = tuple.__new__(cls, args)
inst.__dict__.update(other)
return inst
dct = {'__new__': cstr}
for i, n in enumerate(req_args):
dct[n] = property(operator.itemgetter(i))
return type(str('OptionalReturnType'), (tuple,), dct)
def hsize(nbytes):
"""Readable size.
"""
if nbytes is None:
return "unknown"
KB = 1 << 10
MB = 1 << 20
GB = 1 << 30
TB = 1 << 40
PB = 1 << 50
nbytes = float(nbytes)
if nbytes < KB:
return "{0} bytes".format(nbytes)
elif nbytes < MB:
return "{0:.2f} KB".format(nbytes / KB)
elif nbytes < GB:
return "{0:.2f} MB".format(nbytes / MB)
elif nbytes < TB:
return "{0:.2f} GB".format(nbytes / GB)
elif nbytes < PB:
return "{0:.2f} TB".format(nbytes / TB)
else:
return "{0:.2f} PB".format(nbytes / PB)
def normalize_path(path):
"""Normalize a path obtained from the database.
"""
    # os.path.normpath() keeps two leading slashes: POSIX allows them to have
    # an implementation-defined meaning, but on Linux they mean nothing, so
    # collapse them here
path = PosixPath(path)
if path.path.startswith(path._sep + path._sep):
path = PosixPath(path.path[1:])
return path
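# For example (illustrative): normalize_path('//etc/passwd') gives
# PosixPath('/etc/passwd'), while a single leading slash is preserved as-is.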
def find_all_links_recursive(filename, files):
path = Path('/')
for c in filename.components[1:]:
# At this point, path is a canonical path, and all links in it have
# been resolved
# We add the next path component
path = path / c
# That component is possibly a link
if path.is_link():
# Adds the link itself
files.add(path)
target = path.read_link(absolute=True)
# Here, target might contain a number of symlinks
if target not in files:
# Recurse on this new path
find_all_links_recursive(target, files)
# Restores the invariant; realpath might resolve several links here
path = path.resolve()
return path
def find_all_links(filename, include_target=False):
"""Dereferences symlinks from a path.
If include_target is True, this also returns the real path of the final
target.
Example:
/
a -> b
b
g -> c
c -> ../a/d
d
e -> /f
f
>>> find_all_links('/a/g/e', True)
['/a', '/b/c', '/b/g', '/b/d/e', '/f']
"""
files = set()
filename = Path(filename)
assert filename.absolute()
path = find_all_links_recursive(filename, files)
files = list(files)
if include_target:
files.append(path)
return files
def join_root(root, path):
"""Prepends `root` to the absolute path `path`.
"""
p_root, p_loc = path.split_root()
assert p_root == b'/'
return root / p_loc
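# For example (paths are hypothetical):
#   join_root(Path('/tmp/exp/root'), PosixPath('/etc/passwd'))
# yields /tmp/exp/root/etc/passwd, which is how the unpackers remap the
# experiment's absolute paths into the unpacked tree.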
@contextlib.contextmanager
def make_dir_writable(directory):
"""Context-manager that sets write permission on a directory.
This assumes that the directory belongs to you. If the u+w permission
wasn't set, it gets set in the context, and restored to what it was when
leaving the context. u+x also gets set on all the directories leading to
that path.
"""
uid = os.getuid()
try:
sb = directory.stat()
except OSError:
pass
else:
if sb.st_uid != uid or sb.st_mode & 0o700 == 0o700:
yield
return
# These are the permissions to be restored, in reverse order
restore_perms = []
try:
# Add u+x to all directories up to the target
path = Path('/')
for c in directory.components[1:-1]:
path = path / c
sb = path.stat()
if sb.st_uid == uid and not sb.st_mode & 0o100:
logging.debug("Temporarily setting u+x on %s", path)
restore_perms.append((path, sb.st_mode))
path.chmod(sb.st_mode | 0o700)
# Add u+wx to the target
sb = directory.stat()
if sb.st_uid == uid and sb.st_mode & 0o700 != 0o700:
logging.debug("Temporarily setting u+wx on %s", directory)
restore_perms.append((directory, sb.st_mode))
directory.chmod(sb.st_mode | 0o700)
yield
finally:
for path, mod in reversed(restore_perms):
path.chmod(mod)
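# Typical usage (sketch; paths are hypothetical): write into a read-only
# directory that we own, restoring its permissions afterwards:
#   with make_dir_writable(dest.parent):
#       source.copyfile(dest)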
def rmtree_fixed(path):
"""Like :func:`shutil.rmtree` but doesn't choke on annoying permissions.
If a directory with -w or -x is encountered, it gets fixed and deletion
continues.
"""
if path.is_link():
raise OSError("Cannot call rmtree on a symbolic link")
uid = os.getuid()
st = path.lstat()
if st.st_uid == uid and st.st_mode & 0o700 != 0o700:
path.chmod(st.st_mode | 0o700)
for entry in path.listdir():
if stat.S_ISDIR(entry.lstat().st_mode):
rmtree_fixed(entry)
else:
entry.remove()
path.rmdir()
# Compatibility with ReproZip <= 1.0.3
check_output = subprocess.check_output
def copyfile(source, destination, CHUNK_SIZE=4096):
"""Copies from one file object to another.
"""
while True:
chunk = source.read(CHUNK_SIZE)
if chunk:
destination.write(chunk)
if len(chunk) != CHUNK_SIZE:
break
def download_file(url, dest, cachename=None, ssl_verify=None):
"""Downloads a file using a local cache.
If the file cannot be downloaded or if it wasn't modified, the cached
version will be used instead.
The cache lives in ``~/.cache/reprozip/``.
"""
if cachename is None:
if dest is None:
raise ValueError("One of 'dest' or 'cachename' must be specified")
cachename = dest.components[-1]
headers = {}
if 'XDG_CACHE_HOME' in os.environ:
cache = Path(os.environ['XDG_CACHE_HOME'])
else:
cache = Path('~/.cache').expand_user()
cache = cache / 'reprozip' / cachename
if cache.exists():
mtime = email.utils.formatdate(cache.mtime(), usegmt=True)
headers['If-Modified-Since'] = mtime
cache.parent.mkdir(parents=True)
try:
response = requests.get(url, headers=headers,
timeout=2 if cache.exists() else 10,
stream=True, verify=ssl_verify)
response.raise_for_status()
if response.status_code == 304:
raise requests.HTTPError(
'304 File is up to date, no data returned',
response=response)
except requests.RequestException as e:
if cache.exists():
if e.response and e.response.status_code == 304:
logging.info("Download %s: cache is up to date", cachename)
else:
logging.warning("Download %s: error downloading %s: %s",
cachename, url, e)
if dest is not None:
cache.copy(dest)
return dest
else:
return cache
else:
raise
logging.info("Download %s: downloading %s", cachename, url)
try:
with cache.open('wb') as f:
for chunk in response.iter_content(4096):
f.write(chunk)
response.close()
except Exception as e: # pragma: no cover
try:
cache.remove()
except OSError:
pass
raise e
logging.info("Downloaded %s successfully", cachename)
if dest is not None:
cache.copy(dest)
return dest
else:
return cache
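# A minimal usage sketch (URL and paths are hypothetical):
#   download_file('https://example.com/parameters.json',
#                 Path('/tmp/parameters.json'),
#                 cachename='parameters.json')
# A later call revalidates the copy cached under ~/.cache/reprozip/ with
# If-Modified-Since and reuses it when the server answers 304.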
reprounzip-1.0.10/reprounzip.egg-info/ 0000755 0000765 0000024 00000000000 13130663165 020327 5 ustar remram staff 0000000 0000000 reprounzip-1.0.10/reprounzip.egg-info/PKG-INFO 0000644 0000765 0000024 00000005026 13130663165 021427 0 ustar remram staff 0000000 0000000 Metadata-Version: 1.1
Name: reprounzip
Version: 1.0.10
Summary: Linux tool enabling reproducible experiments (unpacker)
Home-page: http://vida-nyu.github.io/reprozip/
Author: Remi Rampin
Author-email: remirampin@gmail.com
License: BSD-3-Clause
Description: ReproZip
========
`ReproZip `__ is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions, a frequently-used common denominator in computational science. It tracks operating system calls and creates a package that contains all the binaries, files and dependencies required to run a given command on the author's computational environment (packing step). A reviewer can then extract the experiment in his environment to reproduce the results (unpacking step).
reprounzip
----------
This is the component responsible for the unpacking step on Linux distributions.
Please refer to `reprozip `__, `reprounzip-vagrant `_, and `reprounzip-docker `_ for other components and plugins.
A GUI is available at `reprounzip-qt `_.
Additional Information
----------------------
For more detailed information, please refer to our `website `_, as well as to our `documentation `_.
ReproZip is currently being developed at `NYU `_. The team includes:
* `Fernando Chirigati `_
* `Juliana Freire `_
* `Remi Rampin `_
* `Dennis Shasha `_
* `Vicky Steeves `_
Keywords: reprozip,reprounzip,reproducibility,provenance,vida,nyu
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: System :: Archiving
reprounzip-1.0.10/reprounzip.egg-info/SOURCES.txt 0000644 0000765 0000024 00000001366 13130663165 022221 0 ustar remram staff 0000000 0000000 LICENSE.txt
MANIFEST.in
README.rst
setup.cfg
setup.py
reprounzip/__init__.py
reprounzip/common.py
reprounzip/main.py
reprounzip/orderedset.py
reprounzip/pack_info.py
reprounzip/parameters.py
reprounzip/signals.py
reprounzip/utils.py
reprounzip.egg-info/PKG-INFO
reprounzip.egg-info/SOURCES.txt
reprounzip.egg-info/dependency_links.txt
reprounzip.egg-info/entry_points.txt
reprounzip.egg-info/namespace_packages.txt
reprounzip.egg-info/requires.txt
reprounzip.egg-info/top_level.txt
reprounzip/plugins/__init__.py
reprounzip/unpackers/__init__.py
reprounzip/unpackers/default.py
reprounzip/unpackers/graph.py
reprounzip/unpackers/common/__init__.py
reprounzip/unpackers/common/misc.py
reprounzip/unpackers/common/packages.py
reprounzip/unpackers/common/x11.py reprounzip-1.0.10/reprounzip.egg-info/dependency_links.txt 0000644 0000765 0000024 00000000001 13130663165 024375 0 ustar remram staff 0000000 0000000
reprounzip-1.0.10/reprounzip.egg-info/entry_points.txt 0000644 0000765 0000024 00000000567 13130663165 023635 0 ustar remram staff 0000000 0000000 [console_scripts]
reprounzip = reprounzip.main:main
[reprounzip.unpackers]
chroot = reprounzip.unpackers.default:setup_chroot
directory = reprounzip.unpackers.default:setup_directory
graph = reprounzip.unpackers.graph:setup
info = reprounzip.pack_info:setup_info
installpkgs = reprounzip.unpackers.default:setup_installpkgs
showfiles = reprounzip.pack_info:setup_showfiles
reprounzip-1.0.10/reprounzip.egg-info/namespace_packages.txt 0000644 0000765 0000024 00000000040 13130663165 024654 0 ustar remram staff 0000000 0000000 reprounzip
reprounzip.unpackers
reprounzip-1.0.10/reprounzip.egg-info/requires.txt 0000644 0000765 0000024 00000000174 13130663165 022731 0 ustar remram staff 0000000 0000000 PyYAML
rpaths>=0.8
usagestats>=0.3
requests
[all]
reprounzip-vagrant>=1.0
reprounzip-docker>=1.0
reprounzip-vistrails>=1.0
reprounzip-1.0.10/reprounzip.egg-info/top_level.txt 0000644 0000765 0000024 00000000013 13130663165 023053 0 ustar remram staff 0000000 0000000 reprounzip
reprounzip-1.0.10/setup.cfg 0000644 0000765 0000024 00000000103 13130663165 016233 0 ustar remram staff 0000000 0000000 [bdist_wheel]
universal = 1
[egg_info]
tag_build =
tag_date = 0
reprounzip-1.0.10/setup.py 0000644 0000765 0000024 00000004444 13127776450 016150 0 ustar remram staff 0000000 0000000 import io
import os
from setuptools import setup
# pip workaround
os.chdir(os.path.abspath(os.path.dirname(__file__)))
# Need an explicit encoding so this reads the same on both PY2 and PY3
with io.open('README.rst', encoding='utf-8') as fp:
description = fp.read()
req = [
'PyYAML',
'rpaths>=0.8',
'usagestats>=0.3',
'requests']
setup(name='reprounzip',
version='1.0.10',
packages=['reprounzip', 'reprounzip.unpackers',
'reprounzip.unpackers.common', 'reprounzip.plugins'],
entry_points={
'console_scripts': [
'reprounzip = reprounzip.main:main'],
'reprounzip.unpackers': [
'info = reprounzip.pack_info:setup_info',
'showfiles = reprounzip.pack_info:setup_showfiles',
'graph = reprounzip.unpackers.graph:setup',
'installpkgs = reprounzip.unpackers.default:setup_installpkgs',
'directory = reprounzip.unpackers.default:setup_directory',
'chroot = reprounzip.unpackers.default:setup_chroot']},
namespace_packages=['reprounzip', 'reprounzip.unpackers'],
install_requires=req,
extras_require={
'all': ['reprounzip-vagrant>=1.0', 'reprounzip-docker>=1.0',
'reprounzip-vistrails>=1.0']},
description="Linux tool enabling reproducible experiments (unpacker)",
author="Remi Rampin, Fernando Chirigati, Dennis Shasha, Juliana Freire",
author_email='reprozip-users@vgc.poly.edu',
maintainer="Remi Rampin",
maintainer_email='remirampin@gmail.com',
url='http://vida-nyu.github.io/reprozip/',
long_description=description,
license='BSD-3-Clause',
keywords=['reprozip', 'reprounzip', 'reproducibility', 'provenance',
'vida', 'nyu'],
classifiers=[
'Development Status :: 5 - Production/Stable',
'Intended Audience :: Science/Research',
'License :: OSI Approved :: BSD License',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
'Topic :: Scientific/Engineering',
'Topic :: System :: Archiving'])