pax_global_header 0000666 0000000 0000000 00000000064 14137461156 0014522 g ustar 00root root 0000000 0000000 52 comment=8da4375c758a30d174ce9a44147a1b5f36e5029b reverse_markdown-2.1.1/ 0000775 0000000 0000000 00000000000 14137461156 0015100 5 ustar 00root root 0000000 0000000 reverse_markdown-2.1.1/.gitignore 0000664 0000000 0000000 00000000140 14137461156 0017063 0 ustar 00root root 0000000 0000000 *.gem .bundle .rvmrc .ruby-version .ruby-gemset .codeclimate Gemfile.lock pkg/* coverage/* TODO reverse_markdown-2.1.1/.rspec 0000664 0000000 0000000 00000000010 14137461156 0016204 0 ustar 00root root 0000000 0000000 --color reverse_markdown-2.1.1/.travis.yml 0000664 0000000 0000000 00000000267 14137461156 0017216 0 ustar 00root root 0000000 0000000 language: ruby cache: bundler rvm: - 2.0 - 2.1 - 2.2 - 2.3 - 2.4 - 2.5 - 2.6 - 2.7 - jruby-9.2.8.0 notifications: disabled: false recipients: - xijo@pm.me reverse_markdown-2.1.1/CHANGELOG.md 0000664 0000000 0000000 00000006654 14137461156 0016724 0 ustar 00root root 0000000 0000000 # Change Log All notable changes to this project will be documented in this file. ## 2.1.1 - October 2021 - Fixes unintentional newline characters within lists with paragraphs, thanks @diogoosorio, see #93 - Lets \n to be present in
tag. solves #77 #78, thanks @shivabhusal

## 2.1.0 - May 2020
- Add support for `figure` tags, see #86, thanks @anshul78

## 2.0.0 - March 2020
- BREAKING: Dropped support for ruby 1.9.3
- Add support for `details` and `summary` tags, see #85

## 1.4.0 - January 2020
- BREAKING: jump links will no longer be ignored but treated as links, see #82

## 1.3.0 - September 2019
- Add support for `s` HTML tag, thanks @fauno

## 1.2.0 - August 2019
- Handle windows `\r\n` within text blocks, thanks for reporting @krisdigital
- Handle paragraphs in `li` tags, thanks @gstamp

## 1.1.0 - April 2018
- Support Jruby, thanks @grddev (#71)
- Bypass `` tags, thanks @mu-is-too-short (#70)

## 1.0.5 - February 2018
- Fix newline handling within pre tags, thanks @niallcolfer (#69)

## 1.0.4 - November 2017
- Make blockquote behave as true block, thanks for reporting @kanedo (#67)

## 1.0.3 - Apr 2016
### Changes
- Use tag_border option while cleaning up, thanks @AlexanderPruss (#66)

## 1.0.2 - Apr 2016
### Changes
- Handle edge case: exclamation mark before links, thanks @Easy-D (#57)

## 1.0.1 - Jan 2016
### Changes
- Prevent double escaping of * and _, thanks @craig-day (#61)

## 1.0.0 - Nov 2015
### Changes
- BREAKING: Parsing was significantly improved, thanks @craig-day (#60)
  Please update your custom converters to accept and use the state hash; for examples, look into the existing standard converters.
- Use OptionParser for command line options, thanks @grmartin (#55)
- Tag border behavior is now configurable with the `tag_border` option, thanks @faheemmughal (#59)
- Preserve > and < from original markup, thanks @willglynn (#58)

## 0.8.2 - May 2015
### Changes
- Don't add whitespaces in links and images if they contain underscores

## 0.8.1 - April 2015
### Changes
- Don't add newlines after nested lists

## 0.8.0 - April 2015
### Added
- `article` tag is now supported and treated like a div

### Changed
- Special characters are treated correctly inside of backticks, see (#47)

## 0.7.0 - February 2015
### Added
- pre tags now support GitHub and Confluence syntax highlighting

## 0.6.1 - January 2015
### Changed
- Setting config options in block style will last for all following `convert` calls.
- Inline config options are only applied to this particular operation

### Removed
- `config.reset` is removed

## 0.6.0 - September 2014
### Added
- Ignore `col` and `colgroup` tags
- Bypass `thead` and `tbody` tags to show the tables correctly

### Changed
- Eliminate ruby warnings on load (thx @vsipuli)
- Treat newlines within text nodes as space
- Remove whitespace between inline tags and punctuation characters

## 0.5.1 - April 2014
### Added
- Adds support for ruby version 1.9.3 back in
- More options for handling of unknown tags

### Changed
- Bugfixes in `li` indentation behavior

## 0.5.0 - March 2014
**There were some breaking changes, please make sure you don't miss them:**

1. Only ruby versions 2.0.0 or above are supported
2. There is no `Mapper` class any more. Just use `ReverseMarkdown.convert(input, options)`
3. Config option `github_style_code_blocks` changed its name to `github_flavored`

Please open an issue and let me know about it if you have any trouble with the new version.
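
The 0.6.1 entry above distinguishes block-style options, which persist across calls, from inline options, which apply to a single `convert` call. The following is a minimal sketch of that scoping only. `SketchConfig` is a hypothetical stand-in written for illustration and is not the gem's own class; the option name `unknown_tags` and its values are taken from this project.

```ruby
# Hypothetical stand-in for illustration; not the gem's actual Config class.
class SketchConfig
  attr_writer :unknown_tags

  def initialize
    @unknown_tags = :pass_through # persistent default
    @inline_options = {}          # per-operation overrides
  end

  # Apply the given options only while the block runs, then discard them.
  def with(options = {})
    @inline_options = options
    yield
  ensure
    @inline_options = {}
  end

  # Inline options win over the persistent setting while they are active.
  def unknown_tags
    @inline_options.fetch(:unknown_tags, @unknown_tags)
  end
end

config = SketchConfig.new
config.unknown_tags = :bypass # block-style setting: lasts for later calls

# Inline override: visible only inside the block.
inline_value = config.with(unknown_tags: :raise) { config.unknown_tags }
after_value  = config.unknown_tags

puts inline_value.inspect
puts after_value.inspect
```

Inside the block the inline value is in effect; once the block returns, the persistent block-style setting applies again.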
reverse_markdown-2.1.1/Gemfile 0000664 0000000 0000000 00000000144 14137461156 0016372 0 ustar 00root root 0000000 0000000 source "http://rubygems.org" # Specify your gem's dependencies in reverse_markdown.gemspec gemspec reverse_markdown-2.1.1/LICENSE 0000664 0000000 0000000 00000000742 14137461156 0016110 0 ustar 00root root 0000000 0000000 DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE Version 2, December 2004 Copyright (C) 2014 Johannes OpperEveryone is permitted to copy and distribute verbatim or modified copies of this license document, and changing it is allowed as long as the name is changed. DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. You just DO WHAT THE FUCK YOU WANT TO. reverse_markdown-2.1.1/README.md 0000664 0000000 0000000 00000010356 14137461156 0016364 0 ustar 00root root 0000000 0000000 # Summary Transform html into markdown. Useful for example if you want to import html into your markdown based application. [](https://travis-ci.org/xijo/reverse_markdown) [](http://badge.fury.io/rb/reverse_markdown) [](https://codeclimate.com/github/xijo/reverse_markdown) [](https://codeclimate.com/github/xijo/reverse_markdown) ## Changelog See [Change Log](CHANGELOG.md) ## Requirements 1. [Nokogiri](http://nokogiri.org/) 2. 
Ruby 2.0.0 or higher

## Installation

Install the gem

```sh
[sudo] gem install reverse_markdown
```

or add it to your Gemfile

```ruby
gem 'reverse_markdown'
```

## Features

- Supports all the established html tags like `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `p`, `em`, `strong`, `i`, `b`, `blockquote`, `code`, `img`, `a`, `hr`, `li`, `ol`, `ul`, `table`, `tr`, `th`, `td`, `br`, `figure`
- Module based - if you miss a tag, just add it
- Can deal with nested lists
- Inline and block code is supported
- Supports blockquote

# Usage

## Ruby

You can convert html content as a string or a Nokogiri document:

```ruby
input  = '<strong>feelings</strong>'
result = ReverseMarkdown.convert input
result.inspect # " **feelings** "
```

## Commandline

It's also possible to convert html files to markdown using the binary:

```sh
$ reverse_markdown file.html > file.md
$ cat file.html | reverse_markdown > file.md
```

## Configuration

The following options are available:

- `unknown_tags` (default `pass_through`) - how to handle unknown tags. Valid options are:
  - `pass_through` - Include the unknown tag completely in the result
  - `drop` - Drop the unknown tag and its content
  - `bypass` - Ignore the unknown tag but try to convert its content
  - `raise` - Raise an error to let you know
- `github_flavored` (default `false`) - use [github flavored markdown](https://help.github.com/articles/github-flavored-markdown) (so far only code blocks are supported)
- `tag_border` (default `' '`) - how to handle tag borders. Valid options are:
  - `' '` - Add whitespace if there is none at tag borders.
  - `''` - Do not add whitespace.

### As options

Just pass your chosen configuration options in after the input. The given options will last for this operation only.

```ruby
ReverseMarkdown.convert(input, unknown_tags: :raise, github_flavored: true)
```

### Preconfigure

Or configure it block style at the initializer level. These configurations will last for all conversions until they are set to something different.

```ruby
ReverseMarkdown.config do |config|
  config.unknown_tags    = :bypass
  config.github_flavored = true
  config.tag_border      = ''
end
```

# Related stuff

- [Write custom converters](https://github.com/xijo/reverse_markdown/wiki/Write-your-own-converter) - Wiki entry about how to write your own converter
- [html_massage](https://github.com/harlantwood/html_massage) - A gem by Harlan T. Wood to convert regular sites into markdown using reverse_markdown
- [word-to-markdown](https://github.com/benbalter/word-to-markdown) - Convert word docs into markdown while using reverse_markdown, by Ben Balter
- [markdown syntax](http://daringfireball.net/projects/markdown) - The markdown syntax specification
- [github flavored markdown](https://help.github.com/articles/github-flavored-markdown) - GitHub's extension to markdown
- [wmd-editor](http://wmd-editor.com) - Markdown flavored text editor

# Thanks

Thanks to all [contributors](https://github.com/xijo/reverse_markdown/graphs/contributors) and all other helpers:

- [Empact](https://github.com/Empact) Ben Woosley
- [harlantwood](https://github.com/harlantwood) Harlan T. Wood
- [aprescott](https://github.com/aprescott) Adam Prescott
- [danschultzer](https://github.com/danschultzer) Dan Schultzer
- [Benjamin-Dobell](https://github.com/Benjamin-Dobell) Benjamin Dobell
- [schkovich](https://github.com/schkovich) Goran Miskovic
- [craig-day](https://github.com/craig-day) Craig Day
- [grmartin](https://github.com/grmartin) Glenn R.
Martin - [willglynn](https://github.com/willglynn) Will Glynn reverse_markdown-2.1.1/Rakefile 0000664 0000000 0000000 00000000520 14137461156 0016542 0 ustar 00root root 0000000 0000000 require 'bundler/gem_tasks' if File.exist?('.codeclimate') ENV["CODECLIMATE_REPO_TOKEN"] = File.read('.codeclimate').strip end require 'rspec/core/rake_task' RSpec::Core::RakeTask.new(:spec) task :default => :spec desc 'Open an irb session preloaded with this library' task :console do sh 'irb -I lib -r reverse_markdown.rb' end reverse_markdown-2.1.1/bin/ 0000775 0000000 0000000 00000000000 14137461156 0015650 5 ustar 00root root 0000000 0000000 reverse_markdown-2.1.1/bin/reverse_markdown 0000775 0000000 0000000 00000001130 14137461156 0021146 0 ustar 00root root 0000000 0000000 #!/usr/bin/env ruby # Usage: reverse_markdown [FILE]... # Usage: cat FILE | reverse_markdown require 'reverse_markdown' require 'optparse' options = {} OptionParser.new do |opts| opts.banner = "Usage: reverse_markdown [options] " opts.on('-u', '--unknown_tags [pass_through, drop, bypass, raise]', 'Unknown tag handling (default: pass_through)') { |v| ReverseMarkdown.config.unknown_tags = v } opts.on('-g', '--github_flavored bool', 'use github flavored markdown (default: false)') { |v| ReverseMarkdown.config.github_flavored = v } end.parse! 
puts ReverseMarkdown.convert(ARGF.read) reverse_markdown-2.1.1/lib/ 0000775 0000000 0000000 00000000000 14137461156 0015646 5 ustar 00root root 0000000 0000000 reverse_markdown-2.1.1/lib/reverse_markdown.rb 0000664 0000000 0000000 00000003730 14137461156 0021553 0 ustar 00root root 0000000 0000000 require 'nokogiri' require 'reverse_markdown/version' require 'reverse_markdown/errors' require 'reverse_markdown/cleaner' require 'reverse_markdown/config' require 'reverse_markdown/converters' require 'reverse_markdown/converters/base' require 'reverse_markdown/converters/a' require 'reverse_markdown/converters/blockquote' require 'reverse_markdown/converters/br' require 'reverse_markdown/converters/bypass' require 'reverse_markdown/converters/code' require 'reverse_markdown/converters/del' require 'reverse_markdown/converters/div' require 'reverse_markdown/converters/drop' require 'reverse_markdown/converters/details' require 'reverse_markdown/converters/em' require 'reverse_markdown/converters/figcaption' require 'reverse_markdown/converters/figure' require 'reverse_markdown/converters/h' require 'reverse_markdown/converters/hr' require 'reverse_markdown/converters/ignore' require 'reverse_markdown/converters/img' require 'reverse_markdown/converters/li' require 'reverse_markdown/converters/ol' require 'reverse_markdown/converters/p' require 'reverse_markdown/converters/pass_through' require 'reverse_markdown/converters/pre' require 'reverse_markdown/converters/strong' require 'reverse_markdown/converters/table' require 'reverse_markdown/converters/td' require 'reverse_markdown/converters/text' require 'reverse_markdown/converters/tr' module ReverseMarkdown def self.convert(input, options = {}) config.with(options) do input = cleaner.force_encoding(input.to_s) root = case input when String then Nokogiri::HTML(input).root when Nokogiri::XML::Document then input.root when Nokogiri::XML::Node then input end root or return '' result = 
ReverseMarkdown::Converters.lookup(root.name).convert(root) cleaner.tidy(result) end end def self.config @config ||= Config.new yield @config if block_given? @config end def self.cleaner @cleaner ||= Cleaner.new end end reverse_markdown-2.1.1/lib/reverse_markdown/ 0000775 0000000 0000000 00000000000 14137461156 0021223 5 ustar 00root root 0000000 0000000 reverse_markdown-2.1.1/lib/reverse_markdown/cleaner.rb 0000664 0000000 0000000 00000005373 14137461156 0023171 0 ustar 00root root 0000000 0000000 module ReverseMarkdown class Cleaner def tidy(string) result = remove_inner_whitespaces(string) result = remove_newlines(result) result = remove_leading_newlines(result) result = clean_tag_borders(result) clean_punctuation_characters(result) end def remove_newlines(string) string.gsub(/\n{3,}/, "\n\n") end def remove_leading_newlines(string) string.gsub(/\A\n+/, '') end def remove_inner_whitespaces(string) string.each_line.inject("") do |memo, line| memo + preserve_border_whitespaces(line) do line.strip.gsub(/[ \t]{2,}/, ' ') end end end # Find non-asterisk content that is enclosed by two or # more asterisks. Ensure that only one whitespace occurs # in the border area. # Same for underscores and brackets. 
def clean_tag_borders(string) result = string.gsub(/\s?\*{2,}.*?\*{2,}\s?/) do |match| preserve_border_whitespaces(match, default_border: ReverseMarkdown.config.tag_border) do match.strip.sub('** ', '**').sub(' **', '**') end end result = result.gsub(/\s?\_{2,}.*?\_{2,}\s?/) do |match| preserve_border_whitespaces(match, default_border: ReverseMarkdown.config.tag_border) do match.strip.sub('__ ', '__').sub(' __', '__') end end result = result.gsub(/\s?~{2,}.*?~{2,}\s?/) do |match| preserve_border_whitespaces(match, default_border: ReverseMarkdown.config.tag_border) do match.strip.sub('~~ ', '~~').sub(' ~~', '~~') end end result.gsub(/\s?\[.*?\]\s?/) do |match| preserve_border_whitespaces(match) do match.strip.sub('[ ', '[').sub(' ]', ']') end end end def clean_punctuation_characters(string) string.gsub(/(\*\*|~~|__)\s([\.!\?'"])/, "\\1".strip + "\\2") end def force_encoding(string) ReverseMarkdown.config.force_encoding or return string string.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '') end private def preserve_border_whitespaces(string, options = {}, &block) return string if string =~ /\A\s*\Z/ default_border = options.fetch(:default_border, '') # If the string contains part of a link so the characters [,],(,) # then don't add any extra spaces default_border = '' if string =~ /[\[\(\]\)]/ string_start = present_or_default(string[/\A\s*/], default_border) string_end = present_or_default(string[/\s*\Z/], default_border) result = yield string_start + result + string_end end def present_or_default(string, default) if string.nil? || string.empty? 
default else string end end end end reverse_markdown-2.1.1/lib/reverse_markdown/config.rb 0000664 0000000 0000000 00000001551 14137461156 0023017 0 ustar 00root root 0000000 0000000 module ReverseMarkdown class Config attr_writer :unknown_tags, :github_flavored, :tag_border, :force_encoding def initialize @unknown_tags = :pass_through @github_flavored = false @force_encoding = false @em_delimiter = '_'.freeze @strong_delimiter = '**'.freeze @inline_options = {} @tag_border = ' '.freeze end def with(options = {}) @inline_options = options result = yield @inline_options = {} result end def unknown_tags @inline_options[:unknown_tags] || @unknown_tags end def github_flavored @inline_options[:github_flavored] || @github_flavored end def tag_border @inline_options[:tag_border] || @tag_border end def force_encoding @inline_options[:force_encoding] || @force_encoding end end end reverse_markdown-2.1.1/lib/reverse_markdown/converters.rb 0000664 0000000 0000000 00000001650 14137461156 0023744 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters def self.register(tag_name, converter) @@converters ||= {} @@converters[tag_name.to_sym] = converter end def self.unregister(tag_name) @@converters.delete(tag_name.to_sym) end def self.lookup(tag_name) @@converters[tag_name.to_sym] or default_converter(tag_name) end private def self.default_converter(tag_name) case ReverseMarkdown.config.unknown_tags.to_sym when :pass_through ReverseMarkdown::Converters::PassThrough.new when :drop ReverseMarkdown::Converters::Drop.new when :bypass ReverseMarkdown::Converters::Bypass.new when :raise raise UnknownTagError, "unknown tag: #{tag_name}" else raise InvalidConfigurationError, "unknown value #{ReverseMarkdown.config.unknown_tags.inspect} for ReverseMarkdown.config.unknown_tags" end end end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/ 0000775 0000000 0000000 00000000000 14137461156 0023415 5 ustar 00root root 0000000 0000000 
reverse_markdown-2.1.1/lib/reverse_markdown/converters/a.rb 0000664 0000000 0000000 00000001103 14137461156 0024155 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class A < Base def convert(node, state = {}) name = treat_children(node, state) href = node['href'] title = extract_title(node) if href.to_s.empty? || name.empty? name else link = "[#{name}](#{href}#{title})" link.prepend(' ') if prepend_space?(node) link end end private def prepend_space?(node) node.at_xpath("preceding::text()[1]").to_s.end_with?('!') end end register :a, A.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/base.rb 0000664 0000000 0000000 00000001105 14137461156 0024651 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Base def treat_children(node, state) node.children.inject('') do |memo, child| memo << treat(child, state) end end def treat(node, state) ReverseMarkdown::Converters.lookup(node.name).convert(node, state) end def escape_keychars(string) string.gsub(/(?<!\\)[*_]/, '*' => '\*', '_' => '\_') end def extract_title(node) title = escape_keychars(node['title'].to_s) title.empty? ? 
'' : %[ "#{title}"] end end end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/blockquote.rb 0000664 0000000 0000000 00000000544 14137461156 0026115 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Blockquote < Base def convert(node, state = {}) content = treat_children(node, state).strip content = ReverseMarkdown.cleaner.remove_newlines(content) "\n\n> " << content.lines.to_a.join('> ') << "\n\n" end end register :blockquote, Blockquote.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/br.rb 0000664 0000000 0000000 00000000250 14137461156 0024342 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Br < Base def convert(node, state = {}) " \n" end end register :br, Br.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/bypass.rb 0000664 0000000 0000000 00000000635 14137461156 0025247 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Bypass < Base def convert(node, state = {}) treat_children(node, state) end end register :document, Bypass.new register :html, Bypass.new register :body, Bypass.new register :span, Bypass.new register :thead, Bypass.new register :tbody, Bypass.new register :tfoot, Bypass.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/code.rb 0000664 0000000 0000000 00000000270 14137461156 0024653 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Code < Base def convert(node, state = {}) "`#{node.text}`" end end register :code, Code.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/del.rb 0000664 0000000 0000000 00000001072 14137461156 0024506 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Del < Base def convert(node, state = {}) content = treat_children(node, state.merge(already_crossed_out: true)) if disabled? || content.strip.empty? 
|| state[:already_crossed_out] content else "~~#{content}~~" end end def enabled? ReverseMarkdown.config.github_flavored end def disabled? !enabled? end end register :strike, Del.new register :s, Del.new register :del, Del.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/details.rb 0000664 0000000 0000000 00000001113 14137461156 0025363 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Details < Base def convert(node, state = {}) content = treat_children(node, state.merge(already_processed: true)) if disabled? || content.strip.empty? || state[:already_processed] content else "##{content}" end end def enabled? ReverseMarkdown.config.github_flavored end def disabled? !enabled? end end register :details, Details.new register :summary, Details.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/div.rb 0000664 0000000 0000000 00000000363 14137461156 0024526 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Div < Base def convert(node, state = {}) "\n" << treat_children(node, state) << "\n" end end register :div, Div.new register :article, Div.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/drop.rb 0000664 0000000 0000000 00000000214 14137461156 0024703 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Drop < Base def convert(node, state = {}) '' end end end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/em.rb 0000664 0000000 0000000 00000000644 14137461156 0024347 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Em < Base def convert(node, state = {}) content = treat_children(node, state.merge(already_italic: true)) if content.strip.empty? 
|| state[:already_italic] content else "#{content[/^\s*/]}_#{content.strip}_#{content[/\s*$/]}" end end end register :em, Em.new register :i, Em.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/figcaption.rb 0000664 0000000 0000000 00000000452 14137461156 0026066 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class FigCaption < Base def convert(node, state = {}) if node.text.strip.empty? "" else "\n" << "_#{node.text.strip}_" << "\n" end end end register :figcaption, FigCaption.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/figure.rb 0000664 0000000 0000000 00000000362 14137461156 0025224 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Figure < Base def convert(node, state = {}) content = treat_children(node, state) "\n#{content.strip}\n" end end register :figure, Figure.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/h.rb 0000664 0000000 0000000 00000000577 14137461156 0024202 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class H < Base def convert(node, state = {}) prefix = '#' * node.name[/\d/].to_i ["\n", prefix, ' ', treat_children(node, state), "\n"].join end end register :h1, H.new register :h2, H.new register :h3, H.new register :h4, H.new register :h5, H.new register :h6, H.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/hr.rb 0000664 0000000 0000000 00000000255 14137461156 0024355 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Hr < Base def convert(node, state = {}) "\n* * *\n" end end register :hr, Hr.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/ignore.rb 0000664 0000000 0000000 00000000377 14137461156 0025234 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Ignore < Base def convert(node, state = {}) '' # noop end end register :colgroup, Ignore.new register :col, Ignore.new register :head, 
Ignore.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/img.rb 0000664 0000000 0000000 00000000435 14137461156 0024520 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Img < Base def convert(node, state = {}) alt = node['alt'] src = node['src'] title = extract_title(node) " " end end register :img, Img.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/li.rb 0000664 0000000 0000000 00000001703 14137461156 0024347 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Li < Base def convert(node, state = {}) contains_child_paragraph = node.first_element_child ? node.first_element_child.name == 'p' : false content_node = contains_child_paragraph ? node.first_element_child : node content = treat_children(content_node, state) indentation = indentation_from(state) prefix = prefix_for(node) "#{indentation}#{prefix}#{content.chomp}\n" + (contains_child_paragraph ? "\n" : '') end def prefix_for(node) if node.parent.name == 'ol' index = node.parent.xpath('li').index(node) "#{index.to_i + 1}. 
" else '- ' end end def indentation_from(state) length = state.fetch(:ol_count, 0) ' ' * [length - 1, 0].max end end register :li, Li.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/ol.rb 0000664 0000000 0000000 00000000451 14137461156 0024354 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Ol < Base def convert(node, state = {}) ol_count = state.fetch(:ol_count, 0) + 1 "\n" << treat_children(node, state.merge(ol_count: ol_count)) end end register :ol, Ol.new register :ul, Ol.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/p.rb 0000664 0000000 0000000 00000000324 14137461156 0024200 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class P < Base def convert(node, state = {}) "\n\n" << treat_children(node, state).strip << "\n\n" end end register :p, P.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/pass_through.rb 0000664 0000000 0000000 00000000232 14137461156 0026445 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class PassThrough < Base def convert(node, state = {}) node.to_s end end end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/pre.rb 0000664 0000000 0000000 00000002062 14137461156 0024530 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Pre < Base def convert(node, state = {}) content = treat_children(node, state) if ReverseMarkdown.config.github_flavored "\n```#{language(node)}\n" << content.strip << "\n```\n" else "\n\n " << content.lines.to_a.join(" ") << "\n\n" end end private # Override #treat as proposed in https://github.com/xijo/reverse_markdown/pull/69 def treat(node, state) case node.name when 'code', 'text' node.text.strip when 'br' "\n" else super end end def language(node) lang = language_from_highlight_class(node) lang || language_from_confluence_class(node) end def language_from_highlight_class(node) 
node.parent['class'].to_s[/highlight-([a-zA-Z0-9]+)/, 1] end def language_from_confluence_class(node) node['class'].to_s[/brush:\s?(:?.*);/, 1] end end register :pre, Pre.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/strong.rb 0000664 0000000 0000000 00000000672 14137461156 0025263 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Strong < Base def convert(node, state = {}) content = treat_children(node, state.merge(already_strong: true)) if content.strip.empty? || state[:already_strong] content else "#{content[/^\s*/]}**#{content.strip}**#{content[/\s*$/]}" end end end register :strong, Strong.new register :b, Strong.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/table.rb 0000664 0000000 0000000 00000000330 14137461156 0025025 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Table < Base def convert(node, state = {}) "\n\n" << treat_children(node, state) << "\n" end end register :table, Table.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/td.rb 0000664 0000000 0000000 00000000370 14137461156 0024351 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Td < Base def convert(node, state = {}) content = treat_children(node, state) " #{content} |" end end register :td, Td.new register :th, Td.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/text.rb 0000664 0000000 0000000 00000002633 14137461156 0024732 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Text < Base def convert(node, options = {}) if node.text.strip.empty? 
treat_empty(node) else treat_text(node) end end private def treat_empty(node) parent = node.parent.name.to_sym if [:ol, :ul].include?(parent) # Otherwise the identation is broken '' elsif node.text == ' ' # Regular whitespace text node ' ' else '' end end def treat_text(node) text = node.text text = preserve_nbsp(text) text = remove_border_newlines(text) text = remove_inner_newlines(text) text = escape_keychars(text) text = preserve_keychars_within_backticks(text) text = preserve_tags(text) text end def preserve_nbsp(text) text.gsub(/\u00A0/, " ") end def preserve_tags(text) text.gsub(/[<>]/, '>' => '\>', '<' => '\<') end def remove_border_newlines(text) text.gsub(/\A\n+/, '').gsub(/\n+\z/, '') end def remove_inner_newlines(text) text.tr("\r\n\t", ' ').squeeze(' ') end def preserve_keychars_within_backticks(text) text.gsub(/`.*?`/) do |match| match.gsub('\_', '_').gsub('\*', '*') end end end register :text, Text.new end end reverse_markdown-2.1.1/lib/reverse_markdown/converters/tr.rb 0000664 0000000 0000000 00000001037 14137461156 0024370 0 ustar 00root root 0000000 0000000 module ReverseMarkdown module Converters class Tr < Base def convert(node, state = {}) content = treat_children(node, state).rstrip result = "|#{content}\n" table_header_row?(node) ? result + underline_for(node) : result end def table_header_row?(node) node.element_children.all? 
{|child| child.name.to_sym == :th} end def underline_for(node) "| " + (['---'] * node.element_children.size).join(' | ') + " |\n" end end register :tr, Tr.new end end reverse_markdown-2.1.1/lib/reverse_markdown/errors.rb 0000664 0000000 0000000 00000000227 14137461156 0023065 0 ustar 00root root 0000000 0000000 module ReverseMarkdown class Error < StandardError end class UnknownTagError < Error end class InvalidConfigurationError < Error end end reverse_markdown-2.1.1/lib/reverse_markdown/version.rb 0000664 0000000 0000000 00000000057 14137461156 0023237 0 ustar 00root root 0000000 0000000 module ReverseMarkdown VERSION = '2.1.1' end reverse_markdown-2.1.1/reverse_markdown.gemspec 0000664 0000000 0000000 00000002172 14137461156 0022024 0 ustar 00root root 0000000 0000000 # -*- encoding: utf-8 -*- $:.push File.expand_path("../lib", __FILE__) require "reverse_markdown/version" Gem::Specification.new do |s| s.name = "reverse_markdown" s.version = ReverseMarkdown::VERSION s.authors = ["Johannes Opper"] s.email = ["johannes.opper@gmail.com"] s.homepage = "http://github.com/xijo/reverse_markdown" s.summary = %q{Convert html code into markdown.} s.description = %q{Map simple html back into markdown, e.g. 
if you want to import existing html data in your application.} s.licenses = ["WTFPL"] s.files = `git ls-files`.split("\n") s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n") s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) } s.require_paths = ["lib"] # specify any dependencies here; for example: s.add_dependency 'nokogiri' s.add_development_dependency 'rspec' s.add_development_dependency 'simplecov' s.add_development_dependency 'rake' s.add_development_dependency 'kramdown' s.add_development_dependency 'byebug' s.add_development_dependency 'codeclimate-test-reporter' end reverse_markdown-2.1.1/spec/ 0000775 0000000 0000000 00000000000 14137461156 0016032 5 ustar 00root root 0000000 0000000 reverse_markdown-2.1.1/spec/assets/ 0000775 0000000 0000000 00000000000 14137461156 0017334 5 ustar 00root root 0000000 0000000 reverse_markdown-2.1.1/spec/assets/anchors.html 0000664 0000000 0000000 00000002306 14137461156 0021660 0 ustar 00root root 0000000 0000000 some text... Foobar Fubar Strong foobar There should be no extra space before and after the anchor (stripped). Exception: after an !there should be an extra space. Even with stripped elements inbetween: !there should be an extra space. ignore anchor tags with no link text not ignore anchor tags with images pass through the text of internal jumplinks without treating them as links pass through the text of anchor tags with no href without treating them as links some text...
![]()
![]()
some text...
reverse_markdown-2.1.1/spec/assets/basic.html 0000664 0000000 0000000 00000002720 14137461156 0021304 0 ustar 00root root 0000000 0000000
plain text
h1
h2
h3
h4
h5
h6
em tag content
before and after empty em tags
before and after em tags containing whitespace
before
and after em tags containing whitespace
double em tags
double em tags in p tag
a em with leading and trailing whitespace
a em with extra leading and trailing whitespace
strong tag content
before and after empty strong tags
before and after strong tags containing whitespace
before
and after strong tags containing whitespace
double strong tags
double strong tags in p tag
before double strong tags containing whitespace after
a strong with leading and trailing whitespace
a strong with extra leading and trailing whitespace
b tag content
i tag content
br tags become double space followed by newline
before hr
after hr
section 1
section 2
reverse_markdown-2.1.1/spec/assets/code.html 0000664 0000000 0000000 00000000562 14137461156 0021137 0 ustar 00root root 0000000 0000000
pre block
code block
pre code block
Paragraph with inline code block
Code with indentation:
var this;
this.is("A multi line code block")
console.log("Yup, it is")
reverse_markdown-2.1.1/spec/assets/escapables.html 0000664 0000000 0000000 00000000351 14137461156 0022323 0 ustar 00root root 0000000 0000000
some text...
**two asterisks**
***three asterisks***
__two underscores__
___three underscores___
some text...
tell application "Foo" beep end tell
reverse_markdown-2.1.1/spec/assets/from_the_wild.html 0000664 0000000 0000000 00000000477 14137461156 0023054 0 ustar 00root root 0000000 0000000
var theoretical_max_infin = 1.0;
.
*** intentcast : logo design
.I\_AM\_HELPFUL
reverse_markdown-2.1.1/spec/assets/full_example.html 0000664 0000000 0000000 00000001456 14137461156 0022705 0 ustar 00root root 0000000 0000000
- li 1
- li 2
- li 3
- li 1
- li 2
- li 3
- li 1
- eins
- eins
- eins
- li 1
- li 2
h1
h2
h3
h4
Hallo em Text
strong
Block of code
link
First quoted paragraph
Second quoted paragraph
![]()
reverse_markdown-2.1.1/spec/assets/html_fragment.html 0000664 0000000 0000000 00000000057 14137461156 0023053 0 ustar 00root root 0000000 0000000
naked text 1
paragraph text
naked text 2
reverse_markdown-2.1.1/spec/assets/lists.html 0000664 0000000 0000000 00000004454 14137461156 0021367 0 ustar 00root root 0000000 0000000
some text...
- unordered list entry
- unordered list entry 2
- ordered list entry
- ordered list entry 2
- list entry 1st hierarchy
- nested unsorted list entry
- deep nested list entry
a nested list with no whitespace:
- item a
- item b
- item bb
- item bc
a nested list with lots of whitespace:
- item wa
- item wb
- item wbb
- item wbc
I want to have a party at my house!
I don't want to cleanup after the party!
li 1, p 1
li 1, p 2
li 2, p 1
- one
- one one
- one two
- two
- two one
- two one one
- two one two
- two two
- three
a nested list between adjacent list items
reverse_markdown-2.1.1/spec/assets/minimum.html 0000664 0000000 0000000 00000000042 14137461156 0021671 0 ustar 00root root 0000000 0000000
reverse_markdown-2.1.1/spec/assets/paragraphs.html 0000664 0000000 0000000 00000000621 14137461156 0022351 0 ustar 00root root 0000000 0000000
- alpha
- bravo
- bravo alpha
- bravo bravo
- bravo bravo alpha
- charlie
- delta
First content
Second content
Complex
Content
Trailing whitespace:
Trailing non-breaking space:
Combination: