html2text-0.2.0/README.md

html2text [Build Status](https://travis-ci.org/soundasleep/html2text_ruby)
==============
`html2text` is a very simple script that uses Ruby's DOM methods to load HTML from a string, and then iterates over the resulting DOM to correctly output plain text. For example:
```html
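<html>
<title>Ignored Title</title>
<body>
  <h1>Hello, World!</h1>

  <p>This is some e-mail content.
  Even though it has whitespace and newlines, the e-mail converter
  will handle it correctly.

  <p>Even mismatched tags.</p>

  <div>A div</div>
  <div>Another div</div>
  <div>A div<div>within a div</div></div>

  <a href="http://foo.com">A link</a>
</body>
</html>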
```
This will be converted into:
```text
Hello, World!
This is some e-mail content. Even though it has whitespace and newlines, the e-mail converter will handle it correctly.
Even mismatched tags.
A div
Another div
A div
within a div
[A link](http://foo.com)
```
See the [original blog post](http://journals.jevon.org/users/jevon-phd/entry/19818) or the related [StackOverflow answer](http://stackoverflow.com/a/2564472/39531).
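
The DOM walk described above can be sketched roughly as follows. This is not the gem's implementation: `Node`, `BLOCK_TAGS`, `IGNORED_TAGS` and `to_text` are hypothetical names, and a hand-built tree stands in for a parsed document so that the sketch needs no HTML parser.

```ruby
# Hypothetical stand-in for a parsed DOM element; text nodes are plain Strings.
Node = Struct.new(:name, :attrs, :children)

BLOCK_TAGS = %w[h1 p div].freeze  # assumption: tags that end with a line break
IGNORED_TAGS = %w[title].freeze   # assumption: tags whose content is dropped

# Recursively convert a node (or a String text node) into plain text.
def to_text(node)
  # Collapse runs of whitespace and newlines inside text nodes.
  return node.gsub(/\s+/, " ") if node.is_a?(String)
  return "" if IGNORED_TAGS.include?(node.name)

  inner = node.children.map { |child| to_text(child) }.join
  if node.name == "a"
    # Render links in the [text](href) style shown in the README output.
    "[#{inner.strip}](#{node.attrs["href"]})"
  elsif BLOCK_TAGS.include?(node.name)
    "#{inner.strip}\n"
  else
    inner
  end
end

tree = Node.new("body", {}, [
  Node.new("title", {}, ["Ignored Title"]),
  Node.new("h1", {}, ["Hello, World!"]),
  Node.new("p", {}, ["This is some e-mail content.\n  Even though it has whitespace"]),
  Node.new("a", { "href" => "http://foo.com" }, ["A link"]),
])

puts to_text(tree)
```

The sketch only covers the title, heading, paragraph and link cases. Nested block elements and tables need more careful newline handling than this.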
## Installing
TODO: document gem installation. Once the gem is installed, you can:
```ruby
require 'html2text'
text = Html2Text.convert(html)
```
## Tests
See all of the test cases defined in [spec/examples/](spec/examples/). These can be run with:
```
bundle install
rspec
```
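
The example spec in this repository globs `spec/examples/*.html` and expects a `.txt` file of the same name holding the expected output. As a rough sketch of that pairing convention (the helper names `fixture_pairs` and `missing_expectations` are hypothetical and not part of the gem), you could check for fixtures that lack an expected-output file like this:

```ruby
require "tmpdir"

# Return [html_path, txt_path] pairs for every .html file in dir.
def fixture_pairs(dir)
  Dir[File.join(dir, "*.html")].sort.map do |html_path|
    [html_path, html_path.sub(/\.html\z/, ".txt")]
  end
end

# Select the pairs whose expected-output .txt file does not exist.
def missing_expectations(dir)
  fixture_pairs(dir).reject { |_, txt_path| File.exist?(txt_path) }
end

# Demonstration against a throwaway directory rather than spec/examples/.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "basic.html"), "<p>hi</p>")
  File.write(File.join(dir, "basic.txt"), "hi\n")
  File.write(File.join(dir, "table.html"), "<table></table>")

  missing_expectations(dir).each do |html_path, txt_path|
    puts "#{File.basename(html_path)} has no #{File.basename(txt_path)}"
  end
end
```

When adding a new test case, this means adding both files to `spec/examples/`.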
## License
`html2text` is licensed under MIT.
## Other versions
Also see [html2text](https://github.com/soundasleep/html2text), the original PHP implementation.
html2text-0.2.0/spec/examples_spec.rb
require "spec_helper"

describe Html2Text do
  describe "#convert" do
    let(:text) { Html2Text.convert(html) }

    examples = Dir[File.dirname(__FILE__) + "/examples/*.html"]

    examples.each do |filename|
      context "#{filename}" do
        let(:html) { File.read(filename) }
        let(:text_file) { filename.sub(".html", ".txt") }
        let(:expected) { Html2Text.fix_newlines(File.read(text_file)) }

        it "has an expected output" do
          expect(File.exist?(text_file)).to eq(true), "'#{text_file}' did not exist"
        end

        it "converts to text" do
          expect(text).to eq(expected)
        end
      end
    end

    it "has examples to test" do
      expect(examples.size).to_not eq(0)
    end
  end
end
html2text-0.2.0/spec/html2text_spec.rb
require "spec_helper"

describe Html2Text do
  describe "#convert" do
    let(:text) { Html2Text.convert(html) }

    context "an empty line" do
      let(:html) { "" }

      it "is an empty line" do
        expect(text).to eq("")
      end
    end

    context "a simple string" do
      let(:html) { "hello world" }

      it "is returned unchanged" do
        expect(text).to eq("hello world")
      end
    end
  end

  describe "#remove_leading_and_trailing_whitespace" do
    let(:subject) { Html2Text.new(nil).remove_leading_and_trailing_whitespace(input) }

    context "an empty string" do
      let(:input) { "" }

      it { is_expected.to eq("") }
    end

    context "many new lines" do
      let(:input) { "hello\n world \n yes" }

      it { is_expected.to eq("hello\nworld\nyes") }
    end
  end
end
html2text-0.2.0/spec/spec_helper.rb
require "rspec"
require "rspec/collection_matchers"
require File.join(File.dirname(__FILE__), "..", "lib", "html2text")
html2text-0.2.0/spec/examples/basic.html
Ignored Title
Hello, World!
This is some e-mail content.
Even though it has whitespace and newlines, the e-mail converter
will handle it correctly.
Even mismatched tags.
A div
Another div
Another line
Yet another line
A link
html2text-0.2.0/spec/examples/more-anchors.txt
Anchor tests
Visit http://openiaml.org or openiaml.org or http://openiaml.org.
To visit with SSL, visit https://openiaml.org or openiaml.org or https://openiaml.org.
To mail, email support@openiaml.org or mailto:support@openiaml.org or support@openiaml.org or mailto:support@openiaml.org.
html2text-0.2.0/spec/examples/table.html
Ignored Title
Hello, World!
Col A | Col B
Data A1 | Data B1
Data A2 | Data B2
Data A3 | Data B4
Total A | Total B