ruby-ole-1.2.11.7/README

= Introduction
The ruby-ole library provides a variety of functions primarily for
working with OLE2 structured storage files, such as those produced by
Microsoft Office (*.doc, *.msg, etc.).
= Example Usage
Here are some examples of how to use the library functionality,
categorised roughly by purpose.
1. Reading and writing files within an OLE container
The recommended way to manipulate the contents is via the
"file_system" API, whereby you use Ole::Storage instance methods
similar to the regular File and Dir class methods.

  ole = Ole::Storage.open('oleWithDirs.ole', 'rb+')
  p ole.dir.entries('.') # => [".", "..", "dir1", "dir2", "file1"]
  p ole.file.read('file1')[0, 25] # => "this is the entry 'file1'"
  ole.dir.mkdir('newdir')
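
For comparison, here is the same sequence run against the real file system with Ruby's own File and Dir, which is the interface the "file_system" API is modelled on. This is only a sketch: the scratch directory and its contents are made up to mirror the sample container, and nothing here touches ruby-ole itself.

```ruby
require 'tmpdir'

listing = nil
head = nil
made_newdir = nil

# Build a throwaway directory laid out like the sample container (assumed layout).
Dir.mktmpdir do |root|
  Dir.mkdir File.join(root, 'dir1')
  Dir.mkdir File.join(root, 'dir2')
  File.write File.join(root, 'file1'), "this is the entry 'file1'"

  # Dir.entries makes no ordering promise, so sort for a stable result.
  listing = Dir.entries(root).sort
  head = File.read(File.join(root, 'file1'))[0, 25]

  Dir.mkdir File.join(root, 'newdir')
  made_newdir = File.directory?(File.join(root, 'newdir'))
end

p listing # => [".", "..", "dir1", "dir2", "file1"]
p head    # => "this is the entry 'file1'"
```

Each call has a counterpart with the same name in the earlier example: Dir.entries becomes ole.dir.entries, File.read becomes ole.file.read, and Dir.mkdir becomes ole.dir.mkdir.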
2. Accessing OLE meta data
Some convenience functions are provided for (currently read only)
access to OLE property sets and other sources of meta data.

  ole = Ole::Storage.open('test_word_95.doc')
  p ole.meta_data.file_format # => "MSWordDoc"
  p ole.meta_data.mime_type # => "application/msword"
  p ole.meta_data.doc_author.split.first # => "Charles"
3. Raw access to underlying OLE internals
This is probably of little interest to most developers using the
library, but for some use cases you may need to drop down to the
lower level API on which the "file_system" API is constructed,
which exposes more of the format details.
OLE storage files can contain multiple entries with the same name,
entries with a slash in the name, and other things that are probably
strictly invalid. This API is the only way to access those entries.
You can access the header object directly:

  p ole.header.num_sbat # => 1
  p ole.header.magic.unpack('H*') # => ["d0cf11e0a1b11ae1"]
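
The magic value shown above is the 8-byte signature at the start of an OLE2 container, so it can be checked without the library at all. The following is a minimal sketch using only the standard library; ole_container? is a hypothetical helper, not part of ruby-ole.

```ruby
require 'stringio'

# The signature bytes, built from the same hex string that unpack('H*') prints.
OLE_MAGIC = ['d0cf11e0a1b11ae1'].pack('H*').freeze

# Hypothetical helper (not part of ruby-ole): true if +io+ starts with the signature.
def ole_container?(io)
  io.rewind
  head = io.read(OLE_MAGIC.bytesize)
  # read returns nil at EOF; compare as binary so encodings cannot interfere.
  !head.nil? && head.b == OLE_MAGIC
end

fake = StringIO.new(OLE_MAGIC + "\0" * 504)
p ole_container?(fake)                       # => true
p ole_container?(StringIO.new('plain text')) # => false
```

A matching signature only says the file claims to be an OLE2 container; it does not validate the rest of the header.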
You can directly access the array of all Dirent objects,
including the root:

  p ole.dirents.length # => 5
  puts ole.root.to_tree
  # =>
  - #
    |- #
    |- #
    |- #
    \- #
Through RangesIO methods, or the relevant Dirent and AllocationTable
methods, you can also find out where within the container a stream is
located, as a list of offset/length pairs:

  p ole.root["\001CompObj"].open { |io| io.ranges } # => [[0, 64], [64, 34]]
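
To make the offset/length pairs concrete, the sketch below gathers the bytes that such a list selects from an underlying IO. It is an illustration of the idea only: read_ranges is a hypothetical helper, the sample data is invented, and this is not how RangesIO is implemented.

```ruby
require 'stringio'

# Hypothetical helper (not part of ruby-ole): concatenate the bytes that the
# given [offset, length] pairs select from +io+, in the order listed.
def read_ranges(io, ranges)
  ranges.each_with_object(''.b) do |(offset, length), out|
    io.seek offset
    chunk = io.read(length)
    out << chunk if chunk # read returns nil when the range starts at EOF
  end
end

# Two adjacent ranges, as in the example above: 64 + 34 bytes.
container = StringIO.new(('a' * 64 + 'b' * 34 + 'c' * 30).b)
p read_ranges(container, [[0, 64], [64, 34]]).bytesize # => 98

# Ranges need not be adjacent or in file order.
scattered = StringIO.new('AAAA....BBBB'.b)
p read_ranges(scattered, [[8, 4], [0, 4]]) # => "BBBBAAAA"
```

The second call shows why a stream is described by a list of pairs rather than a single offset: its blocks can sit anywhere in the container.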
See the documentation for each class for more details.
= Thanks
* The code contained in this project was initially based on chicago's libole
(source available at http://prdownloads.sf.net/chicago/ole.tgz).
* It was later augmented with some corrections by inspecting pole, and (purely
for header definitions) gsf.
* The property set parsing code came from the Apache Java project POIFS.
* The excellent idea for using a pseudo file system style interface by providing
#file and #dir methods which mimic File and Dir, was borrowed (along with almost
unchanged tests!) from Thomas Sondergaard's rubyzip.
= TODO
== 1.2.12
* internal api cleanup
* add buffering to RangesIO so that performance for small reads and writes
isn't so awful. maybe try and remove the bottlenecks of unbuffered first
with more profiling, then implement the buffering on top of that.
* fix mode strings - like truncate when using 'w+', supporting append
'a+' modes etc. done?
* make RangesIO obey readable vs writeable modes.
* more RangesIO completion. ie, doesn't support #<< at the moment.
* maybe some oletool doc.
* make sure `rake test' runs tests both with $KCODE='UTF8', and without,
and maybe ensure i don't regress on 1.9 and jruby either now that they're
fixed.
== 1.3.1
* case insensitive open mode would be nice
* fix property sets a bit more. see TODO in Ole::Storage::MetaData
* ability to zero out padding and unused blocks
* better tests for mbat support.
* further doc cleanup
* add in place testing for jruby and ruby1.9
== Longer term
* more benchmarking, profiling, and speed fixes. was thinking vs other
ruby filesystems (eg, vs File/Dir itself, and vs rubyzip), and vs other
ole implementations (maybe perl's, and poifs) just to check it's in the
ballpark, with no remaining silly bottlenecks.
* supposedly vba does something weird to ole files. test that.

ruby-ole-1.2.11.7/test/test_support.rb

#! /usr/bin/ruby
$: << File.dirname(__FILE__) + '/../lib'

require 'test/unit'
require 'ole/support'

class TestSupport < Test::Unit::TestCase
  TEST_DIR = File.dirname __FILE__

  def test_file
    assert_equal 4096, open("#{TEST_DIR}/oleWithDirs.ole") { |f| f.size }
    # point is to have same interface as:
    assert_equal 4096, StringIO.open(open("#{TEST_DIR}/oleWithDirs.ole", 'rb', &:read)).size
  end

  def test_enumerable
    expect = {0 => [2, 4], 1 => [1, 3]}
    assert_equal expect, [1, 2, 3, 4].group_by { |i| i & 1 }
    assert_equal 10, [1, 2, 3, 4].sum
    assert_equal %w[1 2 3 4], [1, 2, 3, 4].map(&:to_s)
  end

  def test_logger
    io = StringIO.new
    log = Logger.new_with_callstack io
    log.warn 'test'
    expect = %r{^\[\d\d:\d\d:\d\d .*?test_support\.rb:\d+:test_logger\]\nWARN test$}
    assert_match expect, io.string.chomp
  end

  def test_io
    str = 'a' * 5000 + 'b'
    src, dst = StringIO.new(str), StringIO.new
    IO.copy src, dst
    assert_equal str, dst.string
  end

  def test_symbol
    array = (1..10).to_a
    assert_equal 55, array.inject(&:+)
  end
end

class TestIOMode < Test::Unit::TestCase
  def mode s
    Ole::IOMode.new s
  end

  def test_parse
    assert_equal true, mode('r+bbbbb').binary?
    assert_equal false, mode('r+').binary?
    assert_equal false, mode('r+').create?
    assert_equal false, mode('r').create?
    assert_equal true, mode('wb').create?
    assert_equal true, mode('w').truncate?
    assert_equal false, mode('r').truncate?
    assert_equal false, mode('r+').truncate?
    assert_equal true, mode('r+').readable?
    assert_equal true, mode('r+').writeable?
    assert_equal false, mode('r').writeable?
    assert_equal false, mode('w').readable?
    assert_equal true, mode('a').append?
    assert_equal false, mode('w+').append?
  end

  def test_invalid
    assert_raises(ArgumentError) { mode 'rba' }
    assert_raises(ArgumentError) { mode '+r' }
  end

  def test_inspect
    assert_equal '#', mode('r').inspect
    assert_equal '#', mode('wb+').inspect
    assert_equal '#', mode('a').inspect
  end
end

class TestRecursivelyEnumerable < Test::Unit::TestCase
  class Container
    include RecursivelyEnumerable

    def initialize *children
      @children = children
    end

    def each_child(&block)
      @children.each(&block)
    end

    def inspect
      "#"
    end
  end

  def setup
    @root = Container.new(
      Container.new(1),
      Container.new(2,
        Container.new(
          Container.new(3)
        )
      ),
      4,
      Container.new()
    )
  end

  def test_find
    # default traversal
    i = 0
    found = @root.recursive.find do |obj|
      i += 1
      obj == 4
    end
    assert_equal 4, found
    assert_equal 9, i

    i = 0
    found = @root.recursive(:breadth_first).find do |obj|
      i += 1
      obj == 4
    end
    assert_equal 4, found
    assert_equal 4, i

    # this is to make sure we hit the breadth first child cache
    i = 0
    found = @root.recursive(:breadth_first).find do |obj|
      i += 1
      obj == 3
    end
    assert_equal 3, found
    assert_equal 10, i
  end

  def test_to_tree
    assert_equal <<-'end', @root.to_tree
- #
  |- #
  |  \- 1
  |- #
  |  |- 2
  |  \- #
  |     \- #
  |        \- 3
  |- 4
  \- #
    end
  end
end