html2md-0.2.15/.gitignore000064400000000000000000000000231046102023000132430ustar 00000000000000Cargo.lock target/ html2md-0.2.15/.gitlab-ci.yml000064400000000000000000000011031046102023000137070ustar 00000000000000--- variables: GET_SOURCES_ATTEMPTS: "3" GIT_SUBMODULE_STRATEGY: "recursive" stages: - linux build build on default image: image: rust stage: linux build dependencies: [] before_script: &common_compile_debian - apt-get update -qq && apt-get install -y -qq binutils - export CARGO_HOME="${PWD}/.cargo_cache" script: &common_script - cargo build --release after_script: - strip target/release/libhtml2md.so artifacts: &common_artifacts paths: - target/release/libhtml2md.so cache: paths: - target/ - .cargo_cache/ html2md-0.2.15/.vscode/launch.json000064400000000000000000000021721046102023000147700ustar 00000000000000{ // Use IntelliSense to learn about possible attributes. // Hover to view descriptions of existing attributes. // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387 "version": "0.2.0", "configurations": [ { "name": "Debug unit-tests", "type": "lldb", "request": "launch", "program": "./target/debug/deps/unit-c322eb3e597450f1", "args": [ "test_escaping", "--nocapture" ], "cwd": "${workspaceRoot}", "sourceLanguages": ["rust"] }, { "name": "Debug integration-tests", "type": "lldb", "request": "launch", "program": "./target/debug/deps/integration-2be29df99b0a42dd", "args": [ "test_strong_inside_link", "--nocapture" ], "cwd": "${workspaceRoot}", "sourceLanguages": ["rust"] } ] }html2md-0.2.15/AUTHORS.md000064400000000000000000000001661046102023000127320ustar 00000000000000 Oleg `Kanedias` Chernovskiy - Founder, Maintainer, Primary developer Philipp `Phil` Samoylov - Code review (thanks!)html2md-0.2.15/CONTRIBUTING.md000064400000000000000000000054201046102023000135120ustar 00000000000000How to contribute ----------------- Patches and third-party assistance are essential for 
this project. I don't have much time and simply can't afford to test on specific platforms or in delicate environments. I'll try to keep the process of submitting changes as easy as possible, not requiring anything beyond the usual chaos, heresy and mayhem.

Prerequisites
-------------

* Make sure you have a [GitLab account](https://gitlab.com/users/sign_in#register-pane)
* Submit an [issue](https://gitlab.com/Kanedias/html2md/issues/new?issue) for your change. This step is not required, but I highly recommend it. While it can seem redundant, there were numerous situations where I hated myself for not doing it. Anything can happen: the author of the project can reject a patch for not following specific code guidelines that you never saw mentioned, or there can be scripts, tests and lint warnings that you must deal with, or, dead simple, you can just be unlucky enough to submit your patch right before a big API change or version bump. So... just ask if you need anything.
* [Fork](https://gitlab.com/Kanedias/html2md/forks/new) the repository on GitLab
* Create a feature branch from the `master` branch of the main html2md repo. Avoid working directly on the `master` branch: conflicts may arise, you won't be able to update, I may `force push` commits while thinking nobody sees it... etc.
* Commit your changes. If you want to be a very good person in your earthly life, do it as the Linux kernel contributing guide commands: the first line is a short description, the second is empty, the third and all the rest are a full description of the changes. Reference the issue you created previously in the first line with a hash sign so GitLab can link them together. You never know when this may be useful. Like this: `Implement support for tables. Fixes #1`
* Create a merge request from your fork against the main html2md repo. Wait for the smoke build to finish and make sure it passes. Then it's my turn: I'll keep an eye on merge requests and check them on a regular basis.
After some bouncing back and forth over my nitpicking style, it will get merged and we can all sleep happily
* Great, welcome to the club!

What you might have wanted to see here, but nope
--------------------------------------

* No CI yet, I'm planning on adding it.
* No strict coding guidelines for now. I personally lean toward the standard `rustfmt` style and will eventually refactor everything, along with adding a proper CI style check, but for now you are free to submit changes in whatever style you wish.
* I won't punish or disgrace you if your change breaks something. One cannot possibly [know](https://lkml.org/lkml/2004/12/20/255) [everything](http://catb.org/esr/writings/unix-koans/zealot.html). This project is very niche and every contribution is valued.
html2md-0.2.15/COPYING.md000064400000000000000000000004561046102023000127170ustar 00000000000000Additional permissions granted:

* Allowed using this library as MIT-licensed to [`atomicdata-dev/atomic-server`][0] repository
* Allowed using this library as MIT-licensed to [`getreu/tp-note`][1] repository

[0]: https://github.com/atomicdata-dev/atomic-server
[1]: https://gitlab.com/getreu/tp-note
html2md-0.2.15/Cargo.lock0000644000000437740000000000100104620ustar # This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 3 [[package]] name = "aho-corasick" version = "0.7.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "58fb5e95d83b38284460a5fda7d6470aa0b8844d283a0b614b8535e880800d2d" dependencies = [ "memchr", ] [[package]] name = "ansi_term" version = "0.12.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "d52a9bb7ec0cf484c551830a7ce27bd20d67eac647e1befb56b0be4ee39a55d2" dependencies = [ "winapi", ] [[package]] name = "bitflags" version = "1.0.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "b3c30d3802dfb7281680d6285f2ccdaa8c2d8fee41f93805dba5c4cf50dc23cf" [[package]] name = "bytes" version = "1.0.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "b700ce4376041dcd0a327fd0097c41095743c4c8af8887265942faf1100bd040" [[package]] name = "c2-chacha" version = "0.2.3" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "214238caa1bf3a496ec3392968969cab8549f96ff30652c9e56885329315f6bb" dependencies = [ "ppv-lite86", ] [[package]] name = "cesu8" version = "1.1.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "6d43a04d8753f35258c91f8ec639f792891f748a1edbd759cf1dcea3382ad83c" [[package]] name = "cfg-if" version = "0.1.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "082bb9b28e00d3c9d39cc03e64ce4cea0f1bb9b3fde493f0cbc008472d22bdf4" [[package]] name = "combine" version = "4.5.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "cc4369b5e4c0cddf64ad8981c0111e7df4f7078f4d6ba98fb31f2e17c4c57b7e" dependencies = [ "bytes", "memchr", ] [[package]] name = "ctor" version = "0.1.16" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "7fbaabec2c953050352311293be5c6aba8e141ba19d6811862b232d6fd020484" dependencies = [ "quote", "syn 1.0.72", ] [[package]] name = "diff" version = "0.1.12" source = 
"registry+https://github.com/rust-lang/crates.io-index" checksum = "0e25ea47919b1560c4e3b7fe0aaab9becf5b84a10325ddf7db0f0ba5e1026499" [[package]] name = "fuchsia-zircon" version = "0.3.3" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "2e9763c69ebaae630ba35f74888db465e49e259ba1bc0eda7d06f4a067615d82" dependencies = [ "bitflags", "fuchsia-zircon-sys", ] [[package]] name = "fuchsia-zircon-sys" version = "0.3.3" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "3dcaa9ae7725d12cdb85b3ad99a434db70b468c09ded17e012d86b5c1010f7a7" [[package]] name = "futf" version = "0.1.5" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "df420e2e84819663797d1ec6544b13c5be84629e7bb00dc960d6917db2987843" dependencies = [ "mac", "new_debug_unreachable", ] [[package]] name = "getrandom" version = "0.1.13" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "e7db7ca94ed4cd01190ceee0d8a8052f08a247aa1b469a7f68c6a3b71afcf407" dependencies = [ "cfg-if", "libc", "wasi", ] [[package]] name = "html2md" version = "0.2.15" dependencies = [ "html5ever", "indoc", "jni", "lazy_static", "markup5ever_rcdom", "percent-encoding", "pretty_assertions", "regex", "spectral", ] [[package]] name = "html5ever" version = "0.27.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c13771afe0e6e846f1e67d038d4cb29998a6779f93c809212e4e9c32efd244d4" dependencies = [ "log", "mac", "markup5ever", "proc-macro2", "quote", "syn 2.0.68", ] [[package]] name = "indoc" version = "1.0.3" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "e5a75aeaaef0ce18b58056d306c27b07436fbb34b8816c53094b76dd81803136" dependencies = [ "unindent", ] [[package]] name = "jni" version = "0.19.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c6df18c2e3db7e453d3c6ac5b3e9d5182664d28788126d39b91f2d1e22b017ec" dependencies = [ "cesu8", "combine", "jni-sys", "log", 
"thiserror", "walkdir", ] [[package]] name = "jni-sys" version = "0.3.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "8eaf4bc02d17cbdd7ff4c7438cafcdf7fb9a4613313ad11b4f8fefe7d3fa0130" [[package]] name = "lazy_static" version = "1.5.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "bbd2bcb4c963f2ddae06a2efc7e9f3591312473c50c6685e1f298068316e66fe" [[package]] name = "libc" version = "0.2.138" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "db6d7e329c562c5dfab7a46a2afabc8b987ab9a4834c9d1ca04dc54c1546cef8" [[package]] name = "log" version = "0.4.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c84ec4b527950aa83a329754b01dbe3f58361d1c5efacd1f6d68c494d08a17c6" dependencies = [ "cfg-if", ] [[package]] name = "mac" version = "0.1.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c41e0c4fef86961ac6d6f8a82609f55f31b05e4fce149ac5710e439df7619ba4" [[package]] name = "markup5ever" version = "0.12.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "16ce3abbeba692c8b8441d036ef91aea6df8da2c6b6e21c7e14d3c18e526be45" dependencies = [ "log", "phf", "phf_codegen", "string_cache", "string_cache_codegen", "tendril", ] [[package]] name = "markup5ever_rcdom" version = "0.3.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "edaa21ab3701bfee5099ade5f7e1f84553fd19228cf332f13cd6e964bf59be18" dependencies = [ "html5ever", "markup5ever", "tendril", "xml5ever", ] [[package]] name = "memchr" version = "2.7.4" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "78ca9ab1a0babb1e7d5695e3530886289c18cf2f87ec19a575a0abdce112e3a3" [[package]] name = "new_debug_unreachable" version = "1.0.3" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "f40f005c60db6e03bae699e414c58bf9aa7ea02a2d0b9bfbcf19286cc4c82b30" [[package]] name = "num" version = 
"0.1.42" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "4703ad64153382334aa8db57c637364c322d3372e097840c72000dabdcf6156e" dependencies = [ "num-bigint", "num-complex", "num-integer", "num-iter", "num-rational", "num-traits", ] [[package]] name = "num-bigint" version = "0.1.44" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "e63899ad0da84ce718c14936262a41cee2c79c981fc0a0e7c7beb47d5a07e8c1" dependencies = [ "num-integer", "num-traits", "rand 0.4.2", "rustc-serialize", ] [[package]] name = "num-complex" version = "0.1.43" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "b288631d7878aaf59442cffd36910ea604ecd7745c36054328595114001c9656" dependencies = [ "num-traits", "rustc-serialize", ] [[package]] name = "num-integer" version = "0.1.39" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "e83d528d2677f0518c570baf2b7abdcf0cd2d248860b68507bdcb3e91d4c0cea" dependencies = [ "num-traits", ] [[package]] name = "num-iter" version = "0.1.37" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "af3fdbbc3291a5464dc57b03860ec37ca6bf915ed6ee385e7c6c052c422b2124" dependencies = [ "num-integer", "num-traits", ] [[package]] name = "num-rational" version = "0.1.42" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "ee314c74bd753fc86b4780aa9475da469155f3848473a261d2d18e35245a784e" dependencies = [ "num-bigint", "num-integer", "num-traits", "rustc-serialize", ] [[package]] name = "num-traits" version = "0.2.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "0b3a5d7cc97d6d30d8b9bc8fa19bf45349ffe46241e8816f50f62f6d6aaabee1" [[package]] name = "output_vt100" version = "0.1.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "53cdc5b785b7a58c5aad8216b3dfa114df64b0b06ae6e1501cef91df2fbdf8f9" dependencies = [ "winapi", ] [[package]] name = "percent-encoding" version = "2.3.1" 
source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "e3148f5046208a5d56bcfc03053e3ca6334e51da8dfb19b6cdc8b306fae3283e" [[package]] name = "phf" version = "0.11.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "ade2d8b8f33c7333b51bcf0428d37e217e9f32192ae4772156f65063b8ce03dc" dependencies = [ "phf_shared 0.11.2", ] [[package]] name = "phf_codegen" version = "0.11.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "e8d39688d359e6b34654d328e262234662d16cc0f60ec8dcbe5e718709342a5a" dependencies = [ "phf_generator 0.11.2", "phf_shared 0.11.2", ] [[package]] name = "phf_generator" version = "0.8.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "17367f0cc86f2d25802b2c26ee58a7b23faeccf78a396094c13dced0d0182526" dependencies = [ "phf_shared 0.8.0", "rand 0.7.2", ] [[package]] name = "phf_generator" version = "0.11.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "48e4cc64c2ad9ebe670cb8fd69dd50ae301650392e81c05f9bfcb2d5bdbc24b0" dependencies = [ "phf_shared 0.11.2", "rand 0.8.5", ] [[package]] name = "phf_shared" version = "0.8.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c00cf8b9eafe68dde5e9eaa2cef8ee84a9336a47d566ec55ca16589633b65af7" dependencies = [ "siphasher", ] [[package]] name = "phf_shared" version = "0.11.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "90fcb95eef784c2ac79119d1dd819e162b5da872ce6f3c3abe1e8ca1c082f72b" dependencies = [ "siphasher", ] [[package]] name = "ppv-lite86" version = "0.2.17" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "5b40af805b3121feab8a3c29f04d8ad262fa8e0561883e7653e024ae4479e6de" [[package]] name = "precomputed-hash" version = "0.1.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "925383efa346730478fb4838dbe9137d2a47675ad789c546d150a6e1dd4ab31c" [[package]] name = 
"pretty_assertions" version = "0.7.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "1cab0e7c02cf376875e9335e0ba1da535775beb5450d21e1dffca068818ed98b" dependencies = [ "ansi_term", "ctor", "diff", "output_vt100", ] [[package]] name = "proc-macro2" version = "1.0.86" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "5e719e8df665df0d1c8fbfd238015744736151d4445ec0836b8e628aae103b77" dependencies = [ "unicode-ident", ] [[package]] name = "quote" version = "1.0.36" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "0fa76aaf39101c457836aec0ce2316dbdc3ab723cdda1c6bd4e6ad4208acaca7" dependencies = [ "proc-macro2", ] [[package]] name = "rand" version = "0.4.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "eba5f8cb59cc50ed56be8880a5c7b496bfd9bd26394e176bc67884094145c2c5" dependencies = [ "fuchsia-zircon", "libc", "winapi", ] [[package]] name = "rand" version = "0.7.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "3ae1b169243eaf61759b8475a998f0a385e42042370f3a7dbaf35246eacc8412" dependencies = [ "getrandom", "libc", "rand_chacha", "rand_core 0.5.1", "rand_hc", "rand_pcg", ] [[package]] name = "rand" version = "0.8.5" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "34af8d1a0e25924bc5b7c43c079c942339d8f0a8b57c39049bef581b46327404" dependencies = [ "rand_core 0.6.4", ] [[package]] name = "rand_chacha" version = "0.2.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "03a2a90da8c7523f554344f921aa97283eadf6ac484a6d2a7d0212fa7f8d6853" dependencies = [ "c2-chacha", "rand_core 0.5.1", ] [[package]] name = "rand_core" version = "0.5.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "90bde5296fc891b0cef12a6d03ddccc162ce7b2aff54160af9338f8d40df6d19" dependencies = [ "getrandom", ] [[package]] name = "rand_core" version = "0.6.4" source = 
"registry+https://github.com/rust-lang/crates.io-index" checksum = "ec0be4795e2f6a28069bec0b5ff3e2ac9bafc99e6a9a7dc3547996c5c816922c" [[package]] name = "rand_hc" version = "0.2.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "ca3129af7b92a17112d59ad498c6f81eaf463253766b90396d39ea7a39d6613c" dependencies = [ "rand_core 0.5.1", ] [[package]] name = "rand_pcg" version = "0.2.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "16abd0c1b639e9eb4d7c50c0b8100b0d0f849be2349829c740fe8e6eb4816429" dependencies = [ "rand_core 0.5.1", ] [[package]] name = "regex" version = "1.4.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "38cf2c13ed4745de91a5eb834e11c00bcc3709e773173b2ce4c56c9fbde04b9c" dependencies = [ "aho-corasick", "memchr", "regex-syntax", "thread_local", ] [[package]] name = "regex-syntax" version = "0.6.21" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "3b181ba2dcf07aaccad5448e8ead58db5b742cf85dfe035e2227f137a539a189" [[package]] name = "rustc-serialize" version = "0.3.24" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "dcf128d1287d2ea9d80910b5f1120d0b8eede3fbf1abe91c40d39ea7d51e6fda" [[package]] name = "same-file" version = "1.0.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "cfb6eded0b06a0b512c8ddbcf04089138c9b4362c2f696f3c3d76039d68f3637" dependencies = [ "winapi", ] [[package]] name = "serde" version = "1.0.33" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "4fe95aa0d46f04ce5c3a88bdcd4114ecd6144ed0b2725ebca2f1127744357807" [[package]] name = "siphasher" version = "0.3.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "83da420ee8d1a89e640d0948c646c1c088758d3a3c538f943bfa97bdac17929d" [[package]] name = "spectral" version = "0.6.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = 
"ae3c15181f4b14e52eeaac3efaeec4d2764716ce9c86da0c934c3e318649c5ba" dependencies = [ "num", ] [[package]] name = "string_cache" version = "0.8.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "2940c75beb4e3bf3a494cef919a747a2cb81e52571e212bfbd185074add7208a" dependencies = [ "lazy_static", "new_debug_unreachable", "phf_shared 0.8.0", "precomputed-hash", "serde", ] [[package]] name = "string_cache_codegen" version = "0.5.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "f24c8e5e19d22a726626f1a5e16fe15b132dcf21d10177fa5a45ce7962996b97" dependencies = [ "phf_generator 0.8.0", "phf_shared 0.8.0", "proc-macro2", "quote", ] [[package]] name = "syn" version = "1.0.72" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "a1e8cdbefb79a9a5a65e0db8b47b723ee907b7c7f8496c76a1770b5c310bab82" dependencies = [ "proc-macro2", "quote", "unicode-xid", ] [[package]] name = "syn" version = "2.0.68" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "901fa70d88b9d6c98022e23b4136f9f3e54e4662c3bc1bd1d84a42a9a0f0c1e9" dependencies = [ "proc-macro2", "quote", "unicode-ident", ] [[package]] name = "tendril" version = "0.4.3" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "d24a120c5fc464a3458240ee02c299ebcb9d67b5249c8848b09d639dca8d7bb0" dependencies = [ "futf", "mac", "utf-8", ] [[package]] name = "thiserror" version = "1.0.24" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "e0f4a65597094d4483ddaed134f409b2cb7c1beccf25201a9f73c719254fa98e" dependencies = [ "thiserror-impl", ] [[package]] name = "thiserror-impl" version = "1.0.24" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "7765189610d8241a44529806d6fd1f2e0a08734313a35d5b3a556f92b381f3c0" dependencies = [ "proc-macro2", "quote", "syn 1.0.72", ] [[package]] name = "thread_local" version = "1.0.1" source = 
"registry+https://github.com/rust-lang/crates.io-index" checksum = "d40c6d1b69745a6ec6fb1ca717914848da4b44ae29d9b3080cbee91d72a69b14" dependencies = [ "lazy_static", ] [[package]] name = "unicode-ident" version = "1.0.12" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "3354b9ac3fae1ff6755cb6db53683adb661634f67557942dea4facebec0fee4b" [[package]] name = "unicode-xid" version = "0.2.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "826e7639553986605ec5979c7dd957c7895e93eabed50ab2ffa7f6128a75097c" [[package]] name = "unindent" version = "0.1.7" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "f14ee04d9415b52b3aeab06258a3f07093182b88ba0f9b8d203f211a7a7d41c7" [[package]] name = "utf-8" version = "0.7.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "f1262dfab4c30d5cb7c07026be00ee343a6cf5027fdc0104a9160f354e5db75c" [[package]] name = "walkdir" version = "2.1.4" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "63636bd0eb3d00ccb8b9036381b526efac53caf112b7783b730ab3f8e44da369" dependencies = [ "same-file", "winapi", ] [[package]] name = "wasi" version = "0.7.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "b89c3ce4ce14bdc6fb6beaf9ec7928ca331de5df7e5ea278375642a2f478570d" [[package]] name = "winapi" version = "0.3.9" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "5c839a674fcd7a98952e593242ea400abe93992746761e38641405d28b00f419" dependencies = [ "winapi-i686-pc-windows-gnu", "winapi-x86_64-pc-windows-gnu", ] [[package]] name = "winapi-i686-pc-windows-gnu" version = "0.4.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6" [[package]] name = "winapi-x86_64-pc-windows-gnu" version = "0.4.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = 
"712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f" [[package]] name = "xml5ever" version = "0.18.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "9bbb26405d8e919bc1547a5aa9abc95cbfa438f04844f5fdd9dc7596b748bf69" dependencies = [ "log", "mac", "markup5ever", ] html2md-0.2.15/Cargo.toml0000644000000031740000000000100104730ustar # THIS FILE IS AUTOMATICALLY GENERATED BY CARGO # # When uploading crates to the registry Cargo will automatically # "normalize" Cargo.toml files for maximal compatibility # with all versions of Cargo and also rewrite `path` dependencies # to registry (e.g., crates.io) dependencies. # # If you are reading this file be aware that the original Cargo.toml # will likely look very different (and much more reasonable). # See Cargo.toml.orig for the original contents. [package] edition = "2018" name = "html2md" version = "0.2.15" authors = ["Oleg `Kanedias` Chernovskiy "] description = "Library and binary to convert simple html documents into markdown" readme = "README.md" keywords = [ "html", "markdown", "converter", ] categories = [ "development-tools", "parsing", "parser-implementations", ] license = "GPL-3.0+" repository = "https://gitlab.com/Kanedias/html2md" [profile.release] lto = true debug = 0 panic = "abort" [lib] name = "html2md" crate-type = [ "rlib", "dylib", "staticlib", ] [dependencies.html5ever] version = "0.27.0" [dependencies.lazy_static] version = "1.4.0" [dependencies.markup5ever_rcdom] version = "0.3.0" [dependencies.percent-encoding] version = "2.1.0" [dependencies.regex] version = "1.4.2" [dev-dependencies.indoc] version = "1.0.3" [dev-dependencies.pretty_assertions] version = "0.7.2" [dev-dependencies.spectral] version = "0.6.0" [target."cfg(target_os=\"android\")".dependencies.jni] version = "0.19.0" default-features = false [badges.gitlab] branch = "master" repository = "Kanedias/html2md" [badges.maintenance] status = "experimental" 
html2md-0.2.15/Cargo.toml.orig000064400000000000000000000021101046102023000141410ustar 00000000000000[package] name = "html2md" version = "0.2.15" edition = "2018" authors = ["Oleg `Kanedias` Chernovskiy "] description = "Library and binary to convert simple html documents into markdown" repository = "https://gitlab.com/Kanedias/html2md" readme = "README.md" keywords = ["html", "markdown", "converter"] categories = ["development-tools", "parsing", "parser-implementations"] license = "GPL-3.0+" [badges] gitlab = { repository = "Kanedias/html2md", branch = "master" } maintenance = { status = "experimental" } [lib] name = "html2md" crate-type = ["rlib", "dylib", "staticlib"] [dependencies] # string_cache_codegen = "0.4.2" # Needed for markup5ever lazy_static = "1.4.0" html5ever = "0.27.0" markup5ever_rcdom = "0.3.0" regex = "1.4.2" percent-encoding = "2.1.0" [dev-dependencies] spectral = "0.6.0" pretty_assertions = "0.7.2" indoc = "1.0.3" [profile.release] debug = false lto = true panic = 'abort' # To use this project on Android we need JNI [target.'cfg(target_os="android")'.dependencies] jni = { version = "0.19.0", default-features = false } html2md-0.2.15/LICENSE000064400000000000000000001045131046102023000122710ustar 00000000000000 GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright (C) 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The GNU General Public License is a free, copyleft license for software and other kinds of works. The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. 
We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others. For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. 
The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. 
Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. 
The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. 
Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. 
You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. 
You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. 
Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. "Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. 
If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. 
When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. 
If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. 
If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. 
For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. 
If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. 
You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Use with the GNU Affero General Public License. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. 
The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation. If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. 
SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. Limitation of Liability. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 17. Interpretation of Sections 15 and 16. If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. 
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . Also add information on how to contact you by electronic and paper mail. If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode: Copyright (C) This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box". You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see . The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read . html2md-0.2.15/README.md000064400000000000000000000031551046102023000125430ustar 00000000000000HTML2MD ======= Library to convert simple html documents into markdown flavor. Implements markdown as written on its [inception page](https://daringfireball.net/projects/markdown). 
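
To make the header convention concrete, here is a small standalone sketch of the markers that the header handler in `src/headers.rs` writes around header text: an underline for `h1` and `h2`, and hash marks on both sides for `h3` to `h6`. The function `render_header` is a hypothetical helper written for this illustration; it is not part of the html2md API, and it leaves out the blank lines the real printer inserts around headers.

```rust
/// Hypothetical helper (not part of html2md): wraps `text` in the same
/// markers that `HeaderHandler` appends for the given tag name.
/// Whitespace handling is simplified compared to the real printer.
fn render_header(tag: &str, text: &str) -> String {
    let level = match tag {
        // h1 and h2 use the underlined (setext) style
        "h1" => return format!("{}\n==========\n", text),
        "h2" => return format!("{}\n----------\n", text),
        // h3..h6 use hash marks before and after the text
        "h3" => 3,
        "h4" => 4,
        "h5" => 5,
        "h6" => 6,
        // any other tag is passed through untouched in this sketch
        _ => return text.to_string(),
    };
    let hashes = "#".repeat(level);
    format!("{} {} {}\n", hashes, text, hashes)
}
```

For example, `render_header("h1", "Title")` gives the text `Title` followed by a line of ten `=` characters, and `render_header("h3", "Section")` gives `### Section ###`. In the library itself the entry point is `html2md::parse_html`, as used by `src/bin/html2md.rs`.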
Features
--------

Currently supported:

+ Lists (and inner lists)
+ Headers
+ Quotes (and inner quotes)
+ Paragraphs
+ Horizontal rules
+ Images and links
+ Tables
+ Formatting (bold, italic, strikethrough, underline)
+ Code

Limitations
-----------

- no support for markdown flavors (-/+ unordered list styles, ##/== headers etc.)
- doesn't yet detect code style

Used libraries
--------------

[html5ever](https://github.com/servo/html5ever) - Servo engine HTML parsing library, used to convert html input to DOM

[regex](https://github.com/rust-lang/regex) - regular expression support in Rust, used to correct whitespace

Contributions
-------------

You may create a merge request or a bug/enhancement issue right here on GitLab, or send a formatted patch via e-mail. For details see the CONTRIBUTING.md file in this repo.

License
-------------

Copyright (C) 2018-2019 Oleg `Kanedias` Chernovskiy

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

See [COPYING.md](./COPYING.md) for special terms on dual-licensing.
html2md-0.2.15/src/anchors.rs000064400000000000000000000030301046102023000140460ustar 00000000000000
use crate::common::get_tag_attr;
use crate::dummy::IdentityHandler;

use super::TagHandler;
use super::StructuredPrinter;

use markup5ever_rcdom::{Handle,NodeData};

#[derive(Default)]
pub struct AnchorHandler {
    start_pos: usize,
    url: String,
    emit_unchanged: bool,
}

impl TagHandler for AnchorHandler {
    fn handle(&mut self, tag: &Handle, printer: &mut StructuredPrinter) {
        // Check for a `name` attribute.
        // If it exists, we can't support this
        // in markdown, so we must emit this tag unchanged.
        if get_tag_attr(tag, "name").is_some() {
            let mut identity = IdentityHandler::default();
            identity.handle(tag, printer);
            self.emit_unchanged = true;
        }

        self.start_pos = printer.data.len();

        // try to extract a hyperlink
        self.url = match tag.data {
            NodeData::Element { ref attrs, .. } => {
                let attrs = attrs.borrow();
                let href = attrs.iter().find(|attr| attr.name.local.to_string() == "href");
                match href {
                    Some(link) => link.value.to_string(),
                    None => String::new()
                }
            }
            _ => String::new()
        };
    }

    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        if !self.emit_unchanged {
            // add square brackets around the already present text, put the URL afterwards
            printer.insert_str(self.start_pos, "[");
            printer.append_str(&format!("]({})", self.url))
        }
    }
}
html2md-0.2.15/src/bin/html2md.rs000064400000000000000000000004401046102023000145320ustar 00000000000000
extern crate html2md;

use std::io::{self, Read};

fn main() {
    let stdin = io::stdin();
    let mut buffer = String::new();
    let mut handle = stdin.lock();

    handle.read_to_string(&mut buffer).expect("Must be readable HTML!");
    println!("{}", html2md::parse_html(&buffer));
}
html2md-0.2.15/src/codes.rs000064400000000000000000000026241046102023000135160ustar 00000000000000
use super::TagHandler;
use super::StructuredPrinter;

use markup5ever_rcdom::{Handle,NodeData};

#[derive(Default)]
pub struct CodeHandler {
    code_type: String
}

impl CodeHandler {

    /// Used in both starting and finishing handling
    fn do_handle(&mut self, printer: &mut StructuredPrinter, start: bool) {
        let immediate_parent = printer.parent_chain.last().unwrap().to_owned();
        if self.code_type == "code" && immediate_parent == "pre" {
            // we are already in "code" mode
            return;
        }

        match self.code_type.as_ref() {
            "pre" => {
                // code block should have its own paragraph
                if start { printer.insert_newline(); }
                printer.append_str("\n```\n");
                if !start { printer.insert_newline(); }
            },
            "code" | "samp" =>
                printer.append_str("`"),
            _ => {}
        }
    }
}

impl TagHandler for CodeHandler {

    fn handle(&mut self, tag: &Handle, printer: &mut StructuredPrinter) {
        self.code_type = match tag.data {
            NodeData::Element { ref name, .. } => name.local.to_string(),
            _ => String::new()
        };

        self.do_handle(printer, true);
    }

    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        self.do_handle(printer, false);
    }
}
html2md-0.2.15/src/common.rs000064400000000000000000000006651046102023000137140ustar 00000000000000
use markup5ever_rcdom::{Handle,NodeData};

/// Returns the value of attribute `attr_name` on `tag`, if `tag` is an element that has it
pub fn get_tag_attr(tag: &Handle, attr_name: &str) -> Option<String> {
    match tag.data {
        NodeData::Element { ref attrs, .. } => {
            let attrs = attrs.borrow();
            let requested_attr = attrs.iter().find(|attr| attr.name.local.to_string() == attr_name);
            return requested_attr.map(|attr| attr.value.to_string());
        }
        _ => return None
    }
}
html2md-0.2.15/src/containers.rs000064400000000000000000000007221046102023000145630ustar 00000000000000
use super::TagHandler;
use super::StructuredPrinter;

use markup5ever_rcdom::Handle;

#[derive(Default)]
pub struct ContainerHandler;

impl TagHandler for ContainerHandler {

    fn handle(&mut self, _tag: &Handle, printer: &mut StructuredPrinter) {
        printer.insert_newline();
        printer.insert_newline();
    }

    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        printer.insert_newline();
        printer.insert_newline();
    }
}
html2md-0.2.15/src/dummy.rs000064400000000000000000000045551046102023000135610ustar 00000000000000
use super::TagHandler;
use super::StructuredPrinter;

use html5ever::serialize;
use html5ever::serialize::{SerializeOpts, TraversalScope};
use markup5ever_rcdom::{Handle, NodeData, SerializableHandle};

#[derive(Default)]
pub struct DummyHandler;

impl TagHandler for DummyHandler {

    fn handle(&mut self, _tag: &Handle, _printer: &mut StructuredPrinter) {

    }

    fn after_handle(&mut self, _printer: &mut StructuredPrinter) {

    }
}

/// Handler that completely copies tag to printer as HTML with all descendants
#[derive(Default)]
pub(super)
struct IdentityHandler;

impl TagHandler for IdentityHandler {

    fn handle(&mut self, tag: &Handle, printer: &mut StructuredPrinter) {
        let mut buffer = vec![];

        let options = SerializeOpts { traversal_scope: TraversalScope::IncludeNode, .. Default::default() };
        let to_be_serialized = SerializableHandle::from(tag.clone());
        let result = serialize(&mut buffer, &to_be_serialized, options);
        if result.is_err() {
            // couldn't serialize the tag
            return;
        }

        let conv = String::from_utf8(buffer);
        if conv.is_err() {
            // is non-utf8 string possible in html5ever?
            return;
        }

        printer.append_str(&conv.unwrap());
    }

    fn skip_descendants(&self) -> bool {
        return true;
    }

    fn after_handle(&mut self, _printer: &mut StructuredPrinter) {

    }
}

/// Handler that copies just one tag and doesn't skip descendants
#[derive(Default)]
pub struct HtmlCherryPickHandler {
    tag_name: String
}

impl TagHandler for HtmlCherryPickHandler {

    fn handle(&mut self, tag: &Handle, printer: &mut StructuredPrinter) {
        match tag.data {
            NodeData::Element { ref name, ref attrs, .. } => {
                let attrs = attrs.borrow();
                self.tag_name = name.local.to_string();

                // emit the opening tag together with all of its attributes
                printer.append_str(&format!("<{}", self.tag_name));
                for attr in attrs.iter() {
                    printer.append_str(&format!(" {}=\"{}\"", attr.name.local, attr.value));
                }
                printer.append_str(">");
            }
            _ => return
        }
    }

    fn skip_descendants(&self) -> bool {
        return false;
    }

    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        // emit the matching closing tag
        printer.append_str(&format!("</{}>", self.tag_name));
    }
}
html2md-0.2.15/src/headers.rs000064400000000000000000000024111046102023000140260ustar 00000000000000
use super::TagHandler;
use super::StructuredPrinter;

use markup5ever_rcdom::{Handle,NodeData};

#[derive(Default)]
pub struct HeaderHandler {
    header_type: String,
}

impl TagHandler for HeaderHandler {

    fn handle(&mut self, tag: &Handle, printer: &mut StructuredPrinter) {
        self.header_type = match tag.data {
            NodeData::Element { ref name, ..
            } => name.local.to_string(),
            _ => String::new()
        };

        printer.insert_newline();
        printer.insert_newline();
        match self.header_type.as_ref() {
            "h3" => printer.append_str("### "),
            "h4" => printer.append_str("#### "),
            "h5" => printer.append_str("##### "),
            "h6" => printer.append_str("###### "),
            _ => {}
        }
    }

    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        match self.header_type.as_ref() {
            "h1" => printer.append_str("\n==========\n"),
            "h2" => printer.append_str("\n----------\n"),
            "h3" => printer.append_str(" ###\n"),
            "h4" => printer.append_str(" ####\n"),
            "h5" => printer.append_str(" #####\n"),
            "h6" => printer.append_str(" ######\n"),
            _ => {}
        }
        printer.insert_newline();
    }
}
html2md-0.2.15/src/iframes.rs000064400000000000000000000067201046102023000140500ustar 00000000000000
use lazy_static::lazy_static;

use super::TagHandler;
use super::StructuredPrinter;

use crate::common::get_tag_attr;
use crate::dummy::IdentityHandler;

use regex::Regex;
use markup5ever_rcdom::Handle;

lazy_static! {
    /// Pattern that detects iframes with Youtube embedded videos
    /// Examples:
    /// * `https://www.youtube.com/embed/zE-dmXZp3nU?wmode=opaque`
    /// * `https://www.youtube-nocookie.com/embed/5yo6exIypkY`
    /// * `https://www.youtube.com/embed/TXm6IXrbQuM`
    static ref YOUTUBE_PATTERN : Regex = Regex::new(r"www\.youtube(?:-nocookie)?\.com/embed/([-\w]+)").unwrap();

    /// Pattern that detects iframes with Instagram embedded photos
    /// Examples:
    /// * `https://www.instagram.com/p/B1BKr9Wo8YX/embed/`
    /// * `https://www.instagram.com/p/BpKjlo-B4uI/embed/`
    static ref INSTAGRAM_PATTERN: Regex = Regex::new(r"www\.instagram\.com/p/([-\w]+)/embed").unwrap();

    /// Pattern that detects iframes with VKontakte embedded videos
    /// Examples:
    /// * `https://vk.com/video_ext.php?oid=-49423435&id=456245092&hash=e1611aefe899c4f8`
    /// * `https://vk.com/video_ext.php?oid=-76477496&id=456239454&hash=ebfdc2d386617b97`
    static ref VK_PATTERN: Regex = Regex::new(r"vk\.com/video_ext\.php\?oid=(-?\d+)&id=(\d+)&hash=(.*)").unwrap();

    static ref YANDEX_MUSIC_TRACK_PATTERN: Regex = Regex::new(r"https://music.yandex.ru/iframe/#track/(\d+)/(\d+)").unwrap();
    static ref YANDEX_MUSIC_ALBUM_PATTERN: Regex = Regex::new(r"https://music.yandex.ru/iframe/#album/(\d+)").unwrap();
}

#[derive(Default)]
pub struct IframeHandler;

impl TagHandler for IframeHandler {

    fn handle(&mut self, tag: &Handle, printer: &mut StructuredPrinter) {
        printer.insert_newline();
        printer.insert_newline();

        let src = get_tag_attr(tag, "src");
        //let width = get_tag_attr(tag, "width");
        //let height = get_tag_attr(tag, "height");

        if src == None {
            return;
        }

        let src = src.unwrap();

        if let Some(capture) = YOUTUBE_PATTERN.captures(&src) {
            let media_id = capture.get(1).map_or("", |m| m.as_str());
            printer.append_str(&format!("[![Embedded YouTube video](https://img.youtube.com/vi/{mid}/0.jpg)](https://www.youtube.com/watch?v={mid})", mid = media_id));
            return
        }

        if let Some(capture) = INSTAGRAM_PATTERN.captures(&src) {
            let media_id = capture.get(1).map_or("", |m| m.as_str());
            printer.append_str(&format!("[![Embedded Instagram post](https://www.instagram.com/p/{mid}/media/?size=m)](https://www.instagram.com/p/{mid}/embed/)", mid = media_id));
            return
        }

        if let Some(capture) = VK_PATTERN.captures(&src) {
            let owner_id = capture.get(1).map_or("", |m| m.as_str());
            let video_id = capture.get(2).map_or("", |m| m.as_str());
            let _hash = capture.get(3).map_or("", |m| m.as_str());
            printer.append_str(&format!("[![Embedded VK video](https://st.vk.com/images/icons/video_empty_2x.png)](https://vk.com/video{oid}_{vid})", oid = owner_id, vid = video_id));
            return
        }

        // not found, use generic implementation
        let mut identity = IdentityHandler::default();
        identity.handle(tag,
            printer);
    }

    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        printer.insert_newline();
        printer.insert_newline();
    }

    fn skip_descendants(&self) -> bool {
        return true;
    }
}
html2md-0.2.15/src/images.rs000064400000000000000000000045621046102023000136710ustar 00000000000000
use super::TagHandler;
use super::StructuredPrinter;

use crate::common::get_tag_attr;
use crate::dummy::IdentityHandler;

use markup5ever_rcdom::Handle;
use percent_encoding::{utf8_percent_encode, AsciiSet, CONTROLS};

const FRAGMENT: &AsciiSet = &CONTROLS.add(b' ').add(b'"').add(b'<').add(b'>').add(b'`');

/// Handler for `<img>` tag. Depending on circumstances can produce both
/// inline HTML-formatted image and Markdown native one
#[derive(Default)]
pub struct ImgHandler {
    block_mode: bool
}

impl TagHandler for ImgHandler {

    fn handle(&mut self, tag: &Handle, printer: &mut StructuredPrinter) {
        // hack: detect if the image has associated style and has display in block mode
        let style_tag = get_tag_attr(tag, "style");
        if let Some(style) = style_tag {
            if style.contains("display: block") {
                self.block_mode = true
            }
        }

        if self.block_mode {
            // make image on new paragraph
            printer.insert_newline();
            printer.insert_newline();
        }

        // try to extract attrs
        let src = get_tag_attr(tag, "src");
        let alt = get_tag_attr(tag, "alt");
        let title = get_tag_attr(tag, "title");
        let height = get_tag_attr(tag, "height");
        let width = get_tag_attr(tag, "width");
        let align = get_tag_attr(tag, "align");

        if height.is_some() || width.is_some() || align.is_some() {
            // need to handle it as inline html to preserve attributes we support
            let mut identity = IdentityHandler::default();
            identity.handle(tag, printer);
        } else {
            // need to escape URL if it contains spaces
            // don't have any geometry-controlling attrs, post markdown natively
            let mut img_url = src.unwrap_or_default();
            if img_url.contains(' ') {
                img_url = utf8_percent_encode(&img_url, FRAGMENT).to_string();
            }

            printer.append_str(
                &format!("![{}]({}{})",
                    alt.unwrap_or_default(),
                    &img_url,
                    title.map(|value| format!(" \"{}\"", value)).unwrap_or_default()));
        }
    }

    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        if self.block_mode {
            printer.insert_newline();
            printer.insert_newline();
        }
    }
}
html2md-0.2.15/src/lib.rs000064400000000000000000000337251046102023000131750ustar 00000000000000
use lazy_static::lazy_static;

use std::boxed::Box;
use std::collections::HashMap;

use std::os::raw::c_char;
use std::ffi::{CString, CStr};

use regex::Regex;

use html5ever::parse_document;
use html5ever::driver::ParseOpts;
use html5ever::tendril::TendrilSink;

pub use markup5ever_rcdom::{RcDom, Handle, NodeData};

pub mod common;
pub mod dummy;
pub mod anchors;
pub mod paragraphs;
pub mod images;
pub mod headers;
pub mod lists;
pub mod styles;
pub mod codes;
pub mod quotes;
pub mod tables;
pub mod containers;
pub mod iframes;

use crate::dummy::DummyHandler;
use crate::dummy::IdentityHandler;
use crate::dummy::HtmlCherryPickHandler;
use crate::paragraphs::ParagraphHandler;
use crate::anchors::AnchorHandler;
use crate::images::ImgHandler;
use crate::headers::HeaderHandler;
use crate::lists::ListItemHandler;
use crate::lists::ListHandler;
use crate::styles::StyleHandler;
use crate::codes::CodeHandler;
use crate::quotes::QuoteHandler;
use crate::tables::TableHandler;
use crate::containers::ContainerHandler;
use crate::iframes::IframeHandler;

lazy_static!
{
    static ref EXCESSIVE_WHITESPACE_PATTERN: Regex = Regex::new("\\s{2,}").unwrap();   // for HTML on-the-fly cleanup
    static ref EMPTY_LINE_PATTERN: Regex = Regex::new("(?m)^ +$").unwrap();            // for Markdown post-processing
    static ref EXCESSIVE_NEWLINE_PATTERN: Regex = Regex::new("\\n{3,}").unwrap();      // for Markdown post-processing
    static ref TRAILING_SPACE_PATTERN: Regex = Regex::new("(?m)(\\S) $").unwrap();     // for Markdown post-processing
    static ref LEADING_NEWLINES_PATTERN: Regex = Regex::new("^\\n+").unwrap();         // for Markdown post-processing
    static ref LAST_WHITESPACE_PATTERN: Regex = Regex::new("\\s+$").unwrap();          // for Markdown post-processing

    static ref START_OF_LINE_PATTERN: Regex = Regex::new("(^|\\n) *$").unwrap();                  // for Markdown escaping
    static ref MARKDOWN_STARTONLY_KEYCHARS: Regex = Regex::new(r"^(\s*)([=>+\-#])").unwrap();     // for Markdown escaping
    static ref MARKDOWN_MIDDLE_KEYCHARS: Regex = Regex::new(r"[<>*\\_~]").unwrap();               // for Markdown escaping
}

/// Custom variant of main function. Allows passing custom tag <-> tag factory pairs
/// in order to register a custom tag handler for the tags you want.
///
/// You can also override standard tag handlers this way
/// # Arguments
/// `html` is source HTML as `String`
/// `custom` is custom tag handler producers for tags you want, can be empty
pub fn parse_html_custom(html: &str, custom: &HashMap<String, Box<dyn TagHandlerFactory>>) -> String {
    let dom = parse_document(RcDom::default(), ParseOpts::default()).from_utf8().read_from(&mut html.as_bytes()).unwrap();
    let mut result = StructuredPrinter::default();
    walk(&dom.document, &mut result, custom);

    return clean_markdown(&result.data);
}

/// Main function of this library. Parses incoming HTML, converts it into Markdown
/// and returns converted string.
/// # Arguments
/// `html` is source HTML as `String`
pub fn parse_html(html: &str) -> String {
    parse_html_custom(html, &HashMap::default())
}

/// Same as `parse_html` but retains all "span" html elements intact
/// Markdown parsers usually strip them down when rendering but they
/// may be useful for later processing
pub fn parse_html_extended(html: &str) -> String {
    struct SpanAsIsTagFactory;
    impl TagHandlerFactory for SpanAsIsTagFactory {
        fn instantiate(&self) -> Box<dyn TagHandler> {
            return Box::new(HtmlCherryPickHandler::default());
        }
    }

    let mut tag_factory: HashMap<String, Box<dyn TagHandlerFactory>> = HashMap::new();
    tag_factory.insert(String::from("span"), Box::new(SpanAsIsTagFactory{}));
    return parse_html_custom(html, &tag_factory);
}

/// Recursively walk through all DOM tree and handle all elements according to
/// HTML tag -> Markdown syntax mapping. Text content is trimmed to one whitespace according to HTML5 rules.
///
/// # Arguments
/// `input` is DOM tree or its subtree
/// `result` is output holder with position and context tracking
/// `custom` is custom tag handler producers for tags you want, can be empty
fn walk(input: &Handle, result: &mut StructuredPrinter, custom: &HashMap<String, Box<dyn TagHandlerFactory>>) {
    let mut handler : Box<dyn TagHandler> = Box::new(DummyHandler::default());
    let mut tag_name = String::default();
    match input.data {
        NodeData::Document | NodeData::Doctype {..} | NodeData::ProcessingInstruction {..} => {},
        NodeData::Text { ref contents } => {
            let mut text = contents.borrow().to_string();
            let inside_pre = result.parent_chain.iter().any(|tag| tag == "pre");
            if inside_pre {
                // this is preformatted text, insert as-is
                result.append_str(&text);
            } else if !(text.trim().len() == 0 && (result.data.chars().last() == Some('\n') || result.data.chars().last() == Some(' '))) {
                // in case it's not just a whitespace after the newline or another whitespace

                // regular text, collapse whitespace and newlines in text
                let inside_code = result.parent_chain.iter().any(|tag| tag == "code");
                if !inside_code {
                    text = escape_markdown(result,
                        &text);
                }
                let minified_text = EXCESSIVE_WHITESPACE_PATTERN.replace_all(&text, " ");
                let minified_text = minified_text.trim_matches(|ch: char| ch == '\n' || ch == '\r');
                result.append_str(&minified_text);
            }
        }
        NodeData::Comment { .. } => {}, // ignore comments
        NodeData::Element { ref name, .. } => {
            tag_name = name.local.to_string();
            let inside_pre = result.parent_chain.iter().any(|tag| tag == "pre");
            if inside_pre {
                // don't add any html tags inside the pre section
                handler = Box::new(DummyHandler::default());
            } else if custom.contains_key(&tag_name) {
                // have user-supplied factory, instantiate a handler for this tag
                let factory = custom.get(&tag_name).unwrap();
                handler = factory.instantiate();
            } else {
                // no user-supplied factory, take one of built-in ones
                handler = match tag_name.as_ref() {
                    // containers
                    "div" | "section" | "header" | "footer" => Box::new(ContainerHandler::default()),
                    // pagination, breaks
                    "p" | "br" | "hr" => Box::new(ParagraphHandler::default()),
                    "q" | "cite" | "blockquote" => Box::new(QuoteHandler::default()),
                    // spoiler tag
                    "details" | "summary" => Box::new(HtmlCherryPickHandler::default()),
                    // formatting
                    "b" | "i" | "s" | "strong" | "em" | "del" => Box::new(StyleHandler::default()),
                    "h1" | "h2" | "h3" | "h4" | "h5" | "h6" => Box::new(HeaderHandler::default()),
                    "pre" | "code" => Box::new(CodeHandler::default()),
                    // images, links
                    "img" => Box::new(ImgHandler::default()),
                    "a" => Box::new(AnchorHandler::default()),
                    // lists
                    "ol" | "ul" | "menu" => Box::new(ListHandler::default()),
                    "li" => Box::new(ListItemHandler::default()),
                    // as-is
                    "sub" | "sup" => Box::new(IdentityHandler::default()),
                    // tables, handled fully internally as markdown can't have nested content in tables
                    // supports only single tables as of now
                    "table" => Box::new(TableHandler::default()),
                    "iframe" => Box::new(IframeHandler::default()),
                    // other
                    "html" | "head" | "body" => Box::new(DummyHandler::default()),
                    _ => Box::new(DummyHandler::default())
                };
            }
        }
    }

    // handle this tag, while it's not in parent chain
    // and doesn't have child siblings
    handler.handle(&input, result);

    // save this tag name as parent for child nodes
    result.parent_chain.push(tag_name.to_string());     // e.g. it was ["body"] and now it's ["body", "p"]
    let current_depth = result.parent_chain.len();      // e.g. it was 1 and now it's 2

    // create space for siblings of next level
    result.siblings.insert(current_depth, vec![]);

    for child in input.children.borrow().iter() {
        if handler.skip_descendants() {
            continue;
        }

        walk(child, result, custom);

        match child.data {
            NodeData::Element { ref name, .. } => result.siblings.get_mut(&current_depth).unwrap().push(name.local.to_string()),
            _ => {}
        };
    }

    // clear siblings of next level
    result.siblings.remove(&current_depth);

    // release parent tag
    result.parent_chain.pop();

    // finish handling of tag - parent chain now doesn't contain this tag itself again
    handler.after_handle(result);
}

/// This conversion should only be applied to text tags
///
/// Escapes text inside HTML tags so it won't be recognized as Markdown control sequence
/// like list start or bold text style
fn escape_markdown(result: &StructuredPrinter, text: &str) -> String {
    // always escape bold/italic/strikethrough
    let mut data = MARKDOWN_MIDDLE_KEYCHARS.replace_all(&text, "\\$0").to_string();

    // if we're at the start of the line we need to escape list- and quote-starting sequences
    if START_OF_LINE_PATTERN.is_match(&result.data) {
        data = MARKDOWN_STARTONLY_KEYCHARS.replace(&data, "$1\\$2").to_string();
    }

    // no handling of more complicated cases such as
    // ![] or []() ones, for now this will suffice
    return data;
}

/// Called after all processing has been finished
///
/// Clears excessive punctuation that would be trimmed by renderer anyway
fn clean_markdown(text: &str) -> String {
    // remove redundant newlines
    let intermediate = EMPTY_LINE_PATTERN.replace_all(&text, "");                       // empty line with trailing spaces, replace with just newline
    let intermediate =
        EXCESSIVE_NEWLINE_PATTERN.replace_all(&intermediate, "\n\n");                   // 3 or more newlines - not handled by markdown anyway
    let intermediate = TRAILING_SPACE_PATTERN.replace_all(&intermediate, "$1");         // trim space if it's just one
    let intermediate = LEADING_NEWLINES_PATTERN.replace_all(&intermediate, "");         // trim leading newlines
    let intermediate = LAST_WHITESPACE_PATTERN.replace_all(&intermediate, "");          // trim last newlines

    return intermediate.into_owned();
}

/// Intermediate result of HTML -> Markdown conversion.
///
/// Holds context in the form of parent tags and siblings chain
/// and resulting string of markup content with current position.
#[derive(Debug, Default)]
pub struct StructuredPrinter {
    /// Chain of parents leading to upmost tag
    pub parent_chain: Vec<String>,

    /// Siblings of currently processed tag in order where they're appearing in html
    pub siblings: HashMap<usize, Vec<String>>,

    /// resulting markdown document
    pub data: String,
}

impl StructuredPrinter {

    /// Inserts newline
    pub fn insert_newline(&mut self) {
        self.append_str("\n");
    }

    /// Append string to the end of the printer
    pub fn append_str(&mut self, it: &str) {
        self.data.push_str(it);
    }

    /// Insert string at specified position of printer, adjust position to the end of inserted string
    pub fn insert_str(&mut self, pos: usize, it: &str) {
        self.data.insert_str(pos, it);
    }
}

/// Tag handler factory. This class is required in providing proper
/// custom tag parsing capabilities to users of this library.
///
/// The problem with directly providing tag handlers is that they're not stateless.
/// Once tag handler is parsing some tag, it holds data, such as start position, indent etc.
/// The only way to create fresh tag handler for each tag is to provide a factory like this one.
///
pub trait TagHandlerFactory {
    fn instantiate(&self) -> Box<dyn TagHandler>;
}

/// Trait interface describing abstract handler of arbitrary HTML tag.
pub trait TagHandler {
    /// Handle tag encountered when walking HTML tree.
    /// This is executed before the children processing
    fn handle(&mut self, tag: &Handle, printer: &mut StructuredPrinter);

    /// Executed after all children of this tag have been processed
    fn after_handle(&mut self, printer: &mut StructuredPrinter);

    fn skip_descendants(&self) -> bool {
        return false;
    }
}

/// FFI variant for HTML -> Markdown conversion for calling from other languages
#[no_mangle]
pub extern fn parse(html: *const c_char) -> *const c_char {
    let in_html = unsafe { CStr::from_ptr(html) };
    let out_md = parse_html(&in_html.to_string_lossy());

    CString::new(out_md).unwrap().into_raw()
}

/// Expose the JNI interface for android below
#[cfg(target_os="android")]
#[allow(non_snake_case)]
pub mod android {
    extern crate jni;

    use super::parse_html;
    use super::parse_html_extended;

    use self::jni::JNIEnv;
    use self::jni::objects::{JClass, JString};
    use self::jni::sys::jstring;

    #[no_mangle]
    pub unsafe extern fn Java_com_kanedias_html2md_Html2Markdown_parse(env: JNIEnv, _clazz: JClass, html: JString) -> jstring {
        let html_java : String = env.get_string(html).expect("Couldn't get java string!").into();
        let markdown = parse_html(&html_java);
        let output = env.new_string(markdown).expect("Couldn't create java string!");
        output.into_inner()
    }

    #[no_mangle]
    pub unsafe extern fn Java_com_kanedias_html2md_Html2Markdown_parseExtended(env: JNIEnv, _clazz: JClass, html: JString) -> jstring {
        let html_java : String = env.get_string(html).expect("Couldn't get java string!").into();
        let markdown = parse_html_extended(&html_java);
        let output = env.new_string(markdown).expect("Couldn't create java string!");
        output.into_inner()
    }
}
html2md-0.2.15/src/lists.rs000064400000000000000000000062771046102023000135670ustar 00000000000000
use super::TagHandler;
use super::StructuredPrinter;

use markup5ever_rcdom::Handle;

/// gets all list elements registered by a `StructuredPrinter` in reverse order
fn list_hierarchy(printer: &mut StructuredPrinter) -> Vec<&String> {
    printer.parent_chain.iter().rev().filter(|&tag| tag == "ul" || tag == "ol" || tag == "menu").collect()
}

#[derive(Default)]
pub struct ListHandler;

impl TagHandler for ListHandler {

    /// we're entering "ul" or "ol" tag, no "li" handling here
    fn handle(&mut self, _tag: &Handle, printer: &mut StructuredPrinter) {
        printer.insert_newline();

        // insert an extra newline for non-nested lists
        if list_hierarchy(printer).is_empty() {
            printer.insert_newline();
        }
    }

    /// indent now-ready list
    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        printer.insert_newline();
        printer.insert_newline();
    }
}

#[derive(Default)]
pub struct ListItemHandler {
    start_pos: usize,
    list_type: String
}

impl TagHandler for ListItemHandler {

    fn handle(&mut self, _tag: &Handle, printer: &mut StructuredPrinter) {
        {
            let parent_lists = list_hierarchy(printer);
            let nearest_parent_list = parent_lists.first();
            if nearest_parent_list.is_none() {
                // no parent list
                // should not happen - html5ever cleans html input when parsing
                return;
            }

            self.list_type = nearest_parent_list.unwrap().to_string();
        }

        if printer.data.chars().last() != Some('\n') {
            // insert newline when declaring a list item only in case there isn't any newline at the end of text
            printer.insert_newline();
        }

        let current_depth = printer.parent_chain.len();
        let order = printer.siblings[&current_depth].len() + 1;
        match self.list_type.as_ref() {
            "ul" | "menu" => printer.append_str("* "), // unordered list: *, *, *
            "ol" => printer.append_str(&(order.to_string() + ". ")), // ordered list: 1, 2, 3
            _ => {} // never happens
        }

        self.start_pos = printer.data.len();
    }

    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        let padding = match self.list_type.as_ref() {
            "ul" => 2,
            "ol" => 3,
            _ => 4
        };

        // need to cleanup leading newlines, <p> inside <li> should produce valid
        // list element, not an empty line
        let index = self.start_pos;
        while index < printer.data.len() {
            if printer.data.bytes().nth(index) == Some(b'\n') || printer.data.bytes().nth(index) == Some(b' ') {
                printer.data.remove(index);
            } else {
                break;
            }
        }

        // non-nested indentation (padding). Markdown requires that all paragraphs in the
        // list item except first should be indented with at least 1 space
        let mut index = printer.data.len();
        while index > self.start_pos {
            if printer.data.bytes().nth(index) == Some(b'\n') {
                printer.insert_str(index + 1, &" ".repeat(padding));
            }
            index -= 1;
        }
    }
}
html2md-0.2.15/src/paragraphs.rs000064400000000000000000000021011046102023000145370ustar 00000000000000
use super::TagHandler;
use super::StructuredPrinter;

use markup5ever_rcdom::{Handle,NodeData};

#[derive(Default)]
pub struct ParagraphHandler {
    paragraph_type: String
}

impl TagHandler for ParagraphHandler {

    fn handle(&mut self, tag: &Handle, printer: &mut StructuredPrinter) {
        self.paragraph_type = match tag.data {
            NodeData::Element { ref name, ..
            } => name.local.to_string(),
            _ => String::new()
        };

        // insert newlines at the start of paragraph
        match self.paragraph_type.as_ref() {
            "p" => { printer.insert_newline(); printer.insert_newline(); }
            _ => {}
        }
    }

    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        // insert newlines at the end of paragraph
        match self.paragraph_type.as_ref() {
            "p" => { printer.insert_newline(); printer.insert_newline(); }
            "hr" => { printer.insert_newline(); printer.append_str("---"); printer.insert_newline(); }
            // two trailing spaces form a Markdown hard line break
            "br" => printer.append_str("  \n"),
            _ => {}
        }
    }
}
html2md-0.2.15/src/quotes.rs000064400000000000000000000015551046102023000137430ustar 00000000000000
use super::TagHandler;
use super::StructuredPrinter;

use markup5ever_rcdom::Handle;

#[derive(Default)]
pub struct QuoteHandler {
    start_pos: usize
}

impl TagHandler for QuoteHandler {

    fn handle(&mut self, _tag: &Handle, printer: &mut StructuredPrinter) {
        self.start_pos = printer.data.len();
        printer.insert_newline();
    }

    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        // replace all newlines with newline + >
        let quote = "> ";
        let mut index = printer.data.len();
        while index > self.start_pos {
            if printer.data.bytes().nth(index) == Some(b'\n') {
                printer.insert_str(index + 1, &quote);
            }
            index -= 1;
        }

        printer.insert_str(self.start_pos + 1, &quote);

        printer.insert_newline();
        printer.insert_newline();
    }
}
html2md-0.2.15/src/styles.rs000064400000000000000000000033511046102023000137420ustar 00000000000000
use super::TagHandler;
use super::StructuredPrinter;

use markup5ever_rcdom::{Handle,NodeData};

#[derive(Default)]
pub struct StyleHandler {
    start_pos: usize,
    style_type: String
}

/// Applies given `mark` at both start and end indices, updates printer position to the end of text
fn apply_at_bounds(printer: &mut StructuredPrinter, start: usize, end: usize, mark: &str) {
    printer.data.insert_str(end, mark);
    printer.data.insert_str(start, mark);
}

impl TagHandler for StyleHandler {

    fn handle(&mut self, tag: &Handle, printer: &mut
            StructuredPrinter) {
        self.start_pos = printer.data.len();
        self.style_type = match tag.data {
            NodeData::Element { ref name, .. } => name.local.to_string(),
            _ => String::new()
        };
    }

    fn after_handle(&mut self, printer: &mut StructuredPrinter) {
        let non_space_offset = printer.data[self.start_pos..].find(|ch: char| !ch.is_whitespace());
        if non_space_offset.is_none() {
            // only spaces or no text at all
            return;
        }

        let first_non_space_pos = self.start_pos + non_space_offset.unwrap();
        let last_non_space_pos = printer.data.trim_end_matches(|ch: char| ch.is_whitespace()).len();

        // finishing markup
        match self.style_type.as_ref() {
            "b" | "strong" => apply_at_bounds(printer, first_non_space_pos, last_non_space_pos, "**"),
            "i" | "em" => apply_at_bounds(printer, first_non_space_pos, last_non_space_pos, "*"),
            "s" | "del" => apply_at_bounds(printer, first_non_space_pos, last_non_space_pos, "~~"),
            "u" | "ins" => apply_at_bounds(printer, first_non_space_pos, last_non_space_pos, "__"),
            _ => {}
        }
    }
}
html2md-0.2.15/src/tables.rs000064400000000000000000000166111046102023000136740ustar 00000000000000
use super::{walk, clean_markdown};
use super::TagHandler;
use super::StructuredPrinter;

use std::{collections::HashMap, cmp};

use markup5ever_rcdom::{Handle,NodeData};

#[derive(Default)]
pub struct TableHandler;

impl TagHandler for TableHandler {

    fn handle(&mut self, tag: &Handle, printer: &mut StructuredPrinter) {
        let mut table_markup = String::new();
        let any_matcher = |cell: &Handle| { let name = tag_name(cell); name == "td" || name == "th" };

        // detect cell width, counts
        let column_count : usize;
        let mut column_widths : Vec<usize>;
        let rows = find_children(tag, "tr");
        {
            // detect row count
            let most_big_row = rows.iter().max_by(|left, right| collect_children(&left, any_matcher).len().cmp(&collect_children(&right, any_matcher).len()));
            if most_big_row.is_none() {
                // we don't have rows with content at all
                return;
            }
            // have rows with content, set column count
            column_count =
                collect_children(&most_big_row.unwrap(), any_matcher).len();
            column_widths = vec![3; column_count];

            // detect max column width
            for row in &rows {
                let cells = collect_children(row, any_matcher);
                for index in 0..column_count {
                    // from regular rows
                    if let Some(cell) = cells.get(index) {
                        let text = to_text(cell);
                        column_widths[index] = cmp::max(column_widths[index], text.chars().count());
                    }
                }
            }
        }

        // header row must always be present
        for (idx, row) in rows.iter().enumerate() {
            table_markup.push('|');
            let cells = collect_children(row, any_matcher);
            for index in 0..column_count {
                // we need to fill all cells in a column, even if some rows don't have enough
                let padded_cell_text = pad_cell_text(&cells.get(index), column_widths[index]);
                table_markup.push_str(&padded_cell_text);
                table_markup.push('|');
            }
            table_markup.push('\n');

            if idx == 0 {
                // first row is a header row
                // add header-body divider row
                table_markup.push('|');
                for index in 0..column_count {
                    let width = column_widths[index];
                    if width < 3 {
                        // no point in aligning, just post as-is
                        table_markup.push_str(&"-".repeat(width));
                        table_markup.push('|');
                        continue;
                    }

                    // try to detect alignment
                    let mut alignment = String::new();
                    if let Some(header_cell) = cells.get(index) {
                        // we have a header, try to extract alignment from it
                        alignment = match header_cell.data {
                            NodeData::Element { ref attrs, ..
                            } => {
                                let attrs = attrs.borrow();
                                let align_attr = attrs.iter().find(|attr| attr.name.local.to_string() == "align");
                                align_attr.map(|attr| attr.value.to_string()).unwrap_or_default()
                            }
                            _ => String::new()
                        };
                    }

                    // push lines according to alignment, fallback to default behaviour
                    match alignment.as_ref() {
                        "left" => { table_markup.push(':'); table_markup.push_str(&"-".repeat(width - 1)); }
                        "center" => { table_markup.push(':'); table_markup.push_str(&"-".repeat(width - 2)); table_markup.push(':'); }
                        "right" => { table_markup.push_str(&"-".repeat(width - 1)); table_markup.push(':'); }
                        _ => table_markup.push_str(&"-".repeat(width))
                    }
                    table_markup.push('|');
                }
                table_markup.push('\n');
            }
        }

        printer.insert_newline();
        printer.insert_newline();
        printer.append_str(&table_markup);
    }

    fn after_handle(&mut self, _printer: &mut StructuredPrinter) {

    }

    fn skip_descendants(&self) -> bool {
        return true;
    }
}

/// Pads cell text from right and left so it looks centered inside the table cell
/// ### Arguments
/// `tag` - optional reference to currently processed handle, text is extracted from here
///
/// `column_width` - precomputed column width to compute padding length from
fn pad_cell_text(tag: &Option<&Handle>, column_width: usize) -> String {
    let mut result = String::new();
    if let Some(cell) = tag {
        // have header at specified position
        let text = to_text(cell);
        // compute difference between width and text length
        let len_diff = column_width - text.chars().count();
        if len_diff > 0 {
            // should pad
            if len_diff > 1 {
                // should pad from both sides
                let pad_len = len_diff / 2;
                let remainder = len_diff % 2;
                result.push_str(&" ".repeat(pad_len));
                result.push_str(&text);
                result.push_str(&" ".repeat(pad_len + remainder));
            } else {
                // it's just one space, add at the end
                result.push_str(&text);
                result.push(' ');
            }
        } else {
            // shouldn't pad, text fills whole cell
            result.push_str(&text);
        }
    } else {
        // no text in this cell, fill cell with spaces
        let pad_len = column_width;
        result.push_str(&" ".repeat(pad_len));
    }

    return result;
}

/// Extracts tag name from passed tag
/// Returns empty string if it's not an html element
fn tag_name(tag: &Handle) -> String {
    return match tag.data {
        NodeData::Element { ref name, .. } => name.local.to_string(),
        _ => String::new()
    }
}

/// Find descendants of this tag with tag name `name`
/// This includes both direct children and descendants
fn find_children(tag: &Handle, name: &str) -> Vec<Handle> {
    let mut result: Vec<Handle> = vec![];
    let children = tag.children.borrow();
    for child in children.iter() {
        if tag_name(&child) == name {
            result.push(child.clone());
        }

        let mut descendants = find_children(&child, name);
        result.append(&mut descendants);
    }

    return result;
}

/// Collect direct children that satisfy the predicate
/// This doesn't include descendants
fn collect_children<P>(tag: &Handle, predicate: P) -> Vec<Handle>
where P: Fn(&Handle) -> bool {
    let mut result: Vec<Handle> = vec![];
    let children = tag.children.borrow();
    for child in children.iter() {
        let candidate = child.clone();
        if predicate(&candidate) {
            result.push(candidate);
        }
    }

    return result;
}

/// Convert html tag to text. This collects all tag children in correct order where they're observed
/// and concatenates their text, recursively.
fn to_text(tag: &Handle) -> String {
    let mut printer = StructuredPrinter::default();
    walk(tag, &mut printer, &HashMap::default());

    // table cells cannot span several lines, so line breaks become explicit <br/> tags
    let result = clean_markdown(&printer.data);
    return result.replace("\n", "<br/>");
}
html2md-0.2.15/test-samples/dybr-bug-with-list-newlines.html000064400000000000000000000044741046102023000220570ustar 00000000000000

    xx, xx xxxxx x xxxxxx xxxxxxxx xxxxx xxxxxxxxx xxxx xx xxxx xxxx xxxxxxxx.

    xxxx, xxx xx xxxxx xx xxxxxxxxxxx xxxx.

    xxxxxxxxxxx:
    • xxxxxxx x xxxxxxxxx (xxxxx)
    • xxxxxxx xx xxxxxx xxxxxxx, xxxxxxxxxx xxxxxxxxxx xxxx
    • xxxxxxxxx xx xxxxx, xx xxxxxx xx xxxxxxxxxxx
    • xxxxxxx xxxxxx xxxxxxxxx x xxxxxxxxxx, xxxxxxx xxxxxx x xxxxxxx, x xxxxxx.
    • xx xx, xxxxxx xx xxxxxxxx, xx-xxxx xxx x xxxxxxx xxx xxx, xxxxxxx xx xxxx. xxxxxxxxx xx x.

    xxxxx:
    1. xxxxxxxxx xxxxxxxxxx - xxxxx -_- !
    2. xxxxxx Mother of Learning - xxxx, xxxxxxx, xxxxxxxxxxxx
    3. xxxxxx xxxxxxx xxxxxxx, xxxxxxxx "xxx xxxxx". xxxxx xxxxx xxxx, xx x xxxxx xxxxxxx.
    4. xxxxxxxx! xxxx xxx xxxxxxxxx xxxx xxx, xx x xxxxxxxxx.
    5. xxxx xxxxxx - xxxxxx xxxxxxxx xxx x 15-17, xxxxxx xxxxxxxxxxxxx xx xxxxxxxx xxx xxxxxxx xxxxxx.
    xxx xxxx, xxxxx x xxxxxxxxx xx xxxxxxxxxx xxxxxx. xxxxxxxxx spelling puns, xxxxxxx, x xxxxxxxxx, xxxxxxxx xxx xxxxxxxx, xxxxxx xxxxxxxxxx xxxxxx.

    xxx xxxxxxx. xxx xxx xxxxxxxx xxxxxx - x x xxxxxxxxxxx xxxxx xxxx xxxxxxxxxx xxx xxxxx, x xxxxxx xxx xxxxxxxx xxxxxxxxxx xxx xxxxx. xx xxxxxx xxxxxxxx:
    • xxx xxxxx x xxx-xxxx xxxxxxxxx. xxxxxx xxx xxxx xxxxxxxx. x xx x xx xxxxxxxx, xx x xxxxxxx xxxxxx xxxxxx xx xxxxxxxxx. xxxxxxxxxx xxxx xxxxx xxxxxx xxxxxxxxx xxxxxxx xx xxxx.
    • xxxxxx xxxx Kotlin, x xxxxxxx. xxxxxxxxxx, xxxxxxxxxx xxx xxxxx xx xxx x xxxxxxxx
    • xxx xxxxx xxxxxxxxxx Rust, xxx xxx x xx xxx xxxx xxxxxxxxx xxxxxxxxxxxxxx xxxx  xxx xxxxx, xxxxxxxx xxxxxxxxxxxxxx HTML x Markdown
    • xxx xxxx xxxxxx xxx xxxxxxxx xxxxxx. xx xxxx xxx - xxxxxxxxxxxxx xxxxxxxxxxx xxxxxx x xxxxxxxxx xxxxx x xxxxxxx.
    • xxxxxxxxx xxxx xxxxxxxx xxxxxxx xx FUSE 3.0. xxxxx xxxxxxx xxxxxxx xxx xxxxxxxxxxx.
    • x xxxxxxxx xxxx xxxxxxxx DevOps-xxxxxxx x xxxxx xxxxxxx. xxxxxxxxx, xxx xx xxxxx xxxxxx. x, xx, xxx xxx xxx xxxxxxxxx?

    xxxxx xx xxx:
    - xxxxxxxx xxxxxxxx
    - xxxxxxx xxxxxxxxx, xxxxxxx xxxxx xxxxx xxxxxxxx
    - xxxxxxxxxx xxxx Machine Learning, xxxx xxxxxx xxx xxxxxxxx OpenCL.
    html2md-0.2.15/test-samples/dybr-bug-with-lists-from-text.html000064400000000000000000000026351046102023000223400ustar 00000000000000
    xxxx xxx xxxxxxxx xxxxxxx xxxx xxxxxxx xxxxxx xxx xx xxxxxxx
    - x xxxx xxxxx xx xxxxxxxxxx
    - x xxxx xxxxxxxx xxxxxxxxx xxxxxx xxx x xxxxxxxx xxxx
    - xxxx xxxxxxxx
    x xxxxxx xxx xxxxxxxx xxxxxxxxxx x xxx-xx xxxx xxxxx xxxxxxx, xxxxxxxxxxxx x xxxxxxx xx xxxxxxxx xxxxxxx. xxxxxxxx xxxx xxxxxxxx xxxxxxxxxxxx xxxxxx xxxxxxx xx xxxx xxxxx x xxxxx.
    xxx xxxx xxxxxxx x xxxxx xxxxxxxx xxxxxxxx. xxx xxxxxxxx xxxxx xxxxxx xxxxxx x xxxxxxx xxxxxx xxxxxxxxx xxx xx xxxxxxxxxxxxxx ( xxxxx xxxxxxxx SSE 2 xxxxxxx xxxx xxxxx ).
    x xxx xxxxx xxxxxxxx - xxxx xxxx xxxxx xx xxxxxx xxxxxxx - x xxxx xxx xxxxxxxx, xxxxxx xx x xx xxxxxxx. xxxxx xxxxxxxx xxx xxx "xxxxxxxx" xxx xxxxxxx.
    x xxxx xxxxxxxx xxxxxxxxxx xx xxxx xxxxxxx x xxxxx, xxx xxxxxxx xx xxx xxxxxxxxxx. xxxxxxxxx xxx xxxxxx - "x xxxxxx xx xxxxxxx x xxxxxxxxxxxx, xx xx xxxxx xxx xxxxx xxxxxxxxxxxxx - xxxxxx xxxxx, xx xxx x xxx xxxxxxx x x xxx xxx xxxxxxxxx xxxx xxxxxxxx. xxx xxxxxxxxxx xxxxxxx."
    xxxxx xxxxxx xxxxxxxxxx.
    xxxxxxx xxxx xxxx xxxx xxx xxxxx xxxxx, xx xxxxxxxxxxxx xxxxxx xxxxxxx, x xxxxxx xxx xx xxxxxx. xxxxxxx xxx-xx xxxxxxxxxxx xxxxx "xxxxxxxx, x xxxx xxx xxxxxxxxx" xxx xxxx xxxxx xxxxxxxx "xx xxxx xxxxx x xxxxx". xx xxxx xxx xxxxxxxxx xxxxx xx xxxxxxxxxxx xxxxxxxx, x xxxxxxxxxxxxx xx xxxx x xxx xxxxxxxxx xxxxxxxxx.

    html2md-0.2.15/test-samples/dybr-bug-with-strong-inside-link.html000064400000000000000000000017221046102023000227730ustar 00000000000000
    Just God (xxx)
    xxxxx: natoth
    xxxxxx: xxxx - xxxxxxxx xxxxxx
    xxxxxxxxx: xxxx/xxxx
    xxxxxxx: R
    xxxx: Angst/Drama/Fantasy
    xxxxxx: xxxx | 1 973 xxxxx
    xxxxxx: xxxxxxxx
    xxxxxxx: xxxxxxxx xxxxxx
    xxxxxxxxxxxxxx: xxxxxxx
    xxxxxxx: xxxx, xxxxxxxxx xxxxxxx, xxxxxxxx xxx xxxxxxx xxxxx xxxxx x xxxxxxxxx x xxxx, x xxxxx xxxxxxxx, xxx xxxxxx - xxxx xxxxxxxxx...

    xxxxxx xx xxxxxxx

    html2md-0.2.15/test-samples/dybr-bug-with-tables-2-masked.html000064400000000000000000000013521046102023000221250ustar 00000000000000
    xxxxx xxxxxxxxxx xxxxxxx x xxxxx)) xxxxxxxx xxxxxxxx

    At a Glance

     
    Current Conditions: Open all year. No reservations. No services. 
    Reservations: No reservations. 
    Fees No fee. 
    Water: No water.

    html2md-0.2.15/test-samples/dybr-bug-with-tables-masked.html000064400000000000000000000046361046102023000217760ustar 00000000000000
    Maybe I'm foolish, maybe I'm blind
    Thinking I can see through this and see what's behind
    Got no way to prove it so maybe I'm blind

    But I'm only human after all,
    I'm only human after all
    Don't put your blame on me
    xxxxx xxxx, x xxxxxx, xxxxx xxxx — xxxxxx
    xxx xxxxx, xxx xxxx xxxxxx xxxxxx xxx, x xxxxxx xxx xxx xx xxx
    xxxx x xxxx xx xxxx xxxxxxx xxxxxxxxxxxxx, xxx xxx xxxxxxxx, x xxxxxx.

    xx x xxxxx xxxx xxxxxxx, x xxxxx-xx xxxxxx,
    x xxxxx xxxx xxxxxxx, x xxxxx xxxxxx.
    xx xxxx xxxx

    xxxxxx xxxxx xxxxx x xxxxxxx

    x xxxx xxxxxxxxx xxxxxxx xxxxxxxxxxx xx xxxx xxxxx. x xxxxx xxxxxxx, xxxx xxxxx xxxxxxx xx xxxxxxxxxx xxxxxx. xxx xxxxxxxx, xxx xxxxxxxxx xxxxxxxxxxxxxx xx xxxxx — xxxxxxxxxx xxxxxxxxxx x xxxxx xxxxxxxxxxxxx xxxxxxxxx. x xxx xxxxxxxxxxxx xxxx, xxxxxx xxxx, xxxxxxxxxx xxxxx xxxxxxxx, xxxxxxxxxx x xxxxxxxxx. xx xxxxxx xxxxx xxxxxxxxxxxxxxxxx — x xxxxxx xxx xxxx.

    xxxxx xxxxxxxxxx xxxxx x xxxx xxxxxxxxxx xxxxx. xxxxx. x xxxxx: «x xxxxxx xxxxxxx, x xxxxx xxx xxxx, xx xxxxxxxx xxxxxx», — xxx xxxxx xxxxxxxx. xxxxxx xxx x xxxx xxxx xxxxxxxx xxxxxxxx xxxxxxx xxxx xxxxxxxxxxx xxxxxxxxxx, xxxxxxx xxxxxx xxxxxx xxx xxxxx, xxxxxxxxxxx x x xxxxxxx xxxxxxxxx.

    xx x xxxxx xxxx xxxxxxx. xxxxxx xxxxx? xxxxxxxxxxx x xxxxxxxxx xxxxxx.

    x xxxxx x xxxxxxxxxx x xxxxx... x xxxxxx xxxx xxxxxx xxxxxxx xxxxxxxx. xx xxxx, x xxxxxx xxx-xx xxxxxxxxx xx xxxxxxx, xxx xxxxxx xxxxxx, xxx xxx xxxxx, xxxxx xxxxxxxx xx xxxx... x xxxxxx xxxxxxx xx xxxx xxxxx, xxx, xxxxx xxxx xxxxxxxxxx, x xxxxx xxxxxxxxx xx xxxxx. x xxx-xx xxx xxxxx xxxxxxx xxxxxxxxxxxxx.

    xxxxxx xx... xx xxx xx xxxxxxxxxxxxx xxxxxx xxxxxxxxxxxxx x xxxxxxxxxx xxxxx, xxxxx xxx xxxx xxxxxxxxx, x xxxxx xxx xxxxxxxxx, xxx xxxxxxx xxx, xxx xxxx xxxxxxx xxxxxx, x xx xxx, xxx xxxx xxxxxxxx.
    html2md-0.2.15/test-samples/dybr-test.html000064400000000000000000000073541046102023000165150ustar 00000000000000
         xxxxxxxxx xxxxx  xxxxxxxx  xxxxx  xxxxxxxxxx,  xxxxxxxxxxx  xx  xxxxx
        xxxxx, xxxxxxx xxxxx xxxxx xxx xx xxxxx. xxxxx xx xxxx xxx x xxxxxx xxxxxx
        xxx, xxxxxxxxxx x xxxxx.
             - xxxxxxxx!!! - xxxxxxx x.
             xx xxxxxxxxxx x xxxxxx.
             - xxx xx xxxxxxxxxxx xxxxxxx x xxxx?..
             - xxxxxx, - xxxxxxx xxxxxxx xx.
             - xxxxx xxx... xxxxxxxxx. xxxxxxxxx... xxxxx xxxxxx xxx xxxxxxxx...
             xxxxxxx, xx xxxx, xxx xx xx xxxxxxx xxxxxx, xx xxxxxxx  xxxxxxxxx  xx
        xxx xxxxxxxxx. xx xxxx xxxxxxxxx xxxxx, xxxxxxx x xxxx x  xxx  xxxxxxx.  x
        xxxxxx xxx xxx xxxxxxx. xxxx, xxx xxxx xxxxxxx xxx xxxx,  xx  xxxxxx  xxxx
        xxxxx x xxxxx, xxx xx xxxxxxxx xxxx x xxx, xxx xx xxxxxxx, xxxx xx  xxxxxx
        xx xxxxx xxxxxxxxx, - xxx  xx  x  xxxxxxx  xxxx  xxxx...  x  xxxxxxx  xxxx
        xxxxxxxx  xxxxxxxxxxxx  xxxx  xxxxx  xxxxx  xxxxx  x  xxx,  xxx  xxx-xx  x
        xxxxxxxxxx xxxxxx  xxxxx,  x  xxxxxxx  xxxxxxxxx  xxxxxxxx  xxxxx  xxxxxxx
        xxxxxxxxxxxx xxxxx, xxxxx xxxxxxxx xxxx  xxxxxxxxxx  xxxxxxx,  xxxxxxx  xx
        xxxx. x, xxxxxx  xxxxx  xxxx,  xxxxxx  xx  xxxxx  xxxxxxx  xxxxx  xx  xxxx
        xxxxxxxxx, xxxxxxxx, xxxxx xxxxxxx, xxx  xxxxxx  xxxxxxxxxxx  xxxx  xxxxxx
        xxxxxxx, xxx xxxxxxxx x xxxxxxx xxxxxxxxxxx. xxxxxxxxxx, xxxxxxxxxx, xxxxx
        xxxxxxxxx xxxxx, xxxx x xxxxxx  xxxxxxx,  xxxxx  xx  xxx  xx  xxxxxxxx  xx
        xxxxxxx xxxxx, xx xxxxxxx xxxx xxxxx  x,  xxxxxx  xxxxx,  xxxxxxxxxx,  xxx
        xxxxxx xxxxxxxxx xxx, x xxxxxxxxxxxxx xxxxxxxxxxx, xxxx x  xxxx,  x  xxxxx
        xxxx, xxx xxx xxxxx xxxxxxxxx xxxxxx, xxxx xxxxxxx xx  xx  xxxxxxx  xxxxx,
        xxxxxx, xxxx x xxxx, xxxxxx x xxxxx xxxxxx xxxxxxxxxxxxxx xxxxxxxxx  xxxxx
        xxxxxxxxx, xxxxxxx xxxxx xxx xxxx xxxxxxxxxxx  xxxxxxxx  x  xxxxxxx.  xxxx
        xxxxx x xxx, xxx xxxxxxxxxx xxx-xx, xxxxxxxxxx xx xxxxx xxxx, x  xxxxxxxxx
        x xxxxxxx xxxxxxx xxxxxx xxxxxxx xxxxx xxxx xxxxxxx xxxxxx, xxxx  xxxx  xx
        xxxxx  xxxxx  xxxxxxxx  xxxxx.  xxxx  xxx,  xxxxxxxxxx,   xx   xxxxxxxxxxx
        xxxxxxxxx; xxxxx, xxxxxxx xx xxxxxxxxxx... xxxxxxx, xxxxxxxxxxx xx x xxxx.
        xxxxx xxxxxx: xxx xxxxx x xxxxxx xxxx xxxxxxx. xxxx xxxxxx x xxxxxxxxxx  x
        xxxx, xx xxxxxx, xxx... xx, xxxxx. xxxxx xxxxxx: xxx xxxx xxx, xxx x xxxxx
        xxxx. xxxxxx... xx. xx xxxxxxx xx... x xxxxx xx xxxxxx, xxxxx  x  xxxxxxxx
        xxxx.  x  x  xxxxxxx  xxxxxxxxxxx   xxxxxxxxxxx   xxxxxx,   x   xxxxx   xx
        xxxxxxxxxxxxxxx xxxxxxx xxxx (xxx xxxx xxxxxxxxx xx xxxx) xxxxxx xxxxxx xx
        xxxxxxxxxxx xxxxxx; xx xxxxx xxxxxxxx... xx xxxx xxxx  xxx  x  xxxxxxxx  x
        xxxxxxx, xxx xxxxxxxx, xxx xx xxx xxxxxxx xxxxxxxx xxxxxxxxxxx  xxxxxx,  x
        xxxxx, xxxxxxxx xx xxxxxx xxxxxxxx, xxxxxxxxxxx xxxxxxx, x xxxxxxxx,  xxxx
        xxxxxxxxxxxxxx xxxxxxx xxxxxxx x xxxxx xxxxxx...
             xxxxxx, x xxxxx, xxx xx xxx-xxxx xxxxxxx xxxxxx. xxxxxx xxx, xxxxx  x
        xxxxxx, xxxx x xxxx xxxxxxxx, x  x  xxxx  xxxxxx  xxxxxxxx  xxxx  xxxxxxxx
        xxxxxxxx... x xxx xx xxxxx xx xxxxx. x  xxxxx-xx  xxxx  xxxxxxx  xxxxxxxxx
        xxxxx xxxxx xxxxxxxxxxxxxxxxx  xxxxxxxxxxx  xxxxxx  -  xxxxxxx  xxxxxxxxxx
        xxxxxxx xxxxxxxxxxx x xxxxx x  xxxxxxx.  xxxxx  xxxxxx  xxxxx.  x  x  xxxx
        xxxxxxx xxxxxxxx xxxxxxx xxxxxx xxxxx xxxxxxxxx:
             - xx xxxx xxxxxx, xxxxx... xxx xxxxxxx.
             xx xxxx xxx x xxxx xx  xxx,  x  xxxxx  x  xxxxxxx,  xxxxxx  xxx  xxxx
        xxxxxxxxx xxxxxxxxxxxxxx, xx  x  xxxxx  xxxxxx  xxxxxxxxxx  xxx.  xx  xxxx
        xxxxxxxxx, xxx. x xxxxxxx  xxxxxxxxxxxx,  xx  xxxxxxxxxx  xxxx  xx  xxxxx,
        xxxxxxxxxxx xxxxx xxxxxx, xxx xxx xxxxxxxxxxx. xx  xxxxx  xxxxxxx  xxxxxxx
        xxxxxxx xxxxxxx x xxxxx, xx xxxxxx xx xxxxx xx xxxxxxxx.
             xxxxxx xxxxxxx x xx xxxxx xxx, x xx xxxx, xxx x  xxx  xxxxxxxxx  -  x
        xxxx xxxxxxxxxxxxx xxxxxxxxxxx xxxx, xxxxxxx x xxxx.
    html2md-0.2.15/test-samples/holywarsoo-text-with-spoilers-masked.html000064400000000000000000001444701046102023000240420ustar 00000000000000

    xx: https://vk.com/aboutdybr
    xxxxxxx: https://www.patreon.com/simoroshka
    xxxxxx: https://trello.com/b/nj07E2lz/%D0%B4%D1 … 1%80%D1%83
    xxxx xxxxxxxxx xx xxxxx: http://reflexive.some-social-website.gb/p214048733.htm
    xxx xxxxxx xx xxxxx: http://reflexive.some-social-website.gb/?tag=5559954
    x-xxxx xxxxxxxxx: simoroshka@dybr.ru
    xxxxxxxxxxx xx xxxxx: http://nicknames.dybr.ru/
    xxxxxx xxxxx: https://nicknames.dybr.ru/reserve
    xxxxxxxxxxxx: https://vk.com/wall-137986293_354
    xxxxx xxxxx xx xxxxx (xxx xxxxxxxxxxxxx xxxxx xxxxx xx xxxxxxxx xx xxxxxxxxxxxxxxxxxxxx): https://dybr.ru/feed

    xxx xxxxxxxxxxxxx xxxxxxx xxxxxx xx support@dybr.ru xxx x xxxxxxxxxxxxxxx xxxxxxxxxx x xxxxxx xxxxxxxxx:  https://vk.com/topic-137986293_36602740

    xxxxxxxxxxxx xxxxxxxx

    xxxxxx xxxxx

    xxxxxxx xxxxx

    y6-9dAg-aHc.jpg

    xxxxxx xxxxxxx

    xxxxxxx xxxxxx xxxxxxxx

    00edd43c-db93-4efc-87c7-b1f26f419ac9.jpg

    xxxxxxx xxxxxxxx xx xxxx xx xxxxxxx

    3f4dc1e6-8012-40fb-bb61-513549047ee9.jpg

    xxxxxxx xxxx. xxx xxxx xxxx xxxxxx xxxxx xxxxx xxxxx xxx.

    05d3c0c4-0c0e-48a1-a32d-00a9fb6a87a5.jpg

    xxxxx-xxxxxxx xxxxxxxx xxxxxxx

    90766f28-d63c-4f4d-af18-a4fbc9d2d940.jpg

    xxxxxxxxxx xxxxx xxxxxx

    614e2270-1aaf-4695-b547-aabb0fec97b1.jpg

    xxxxxx xxxxxxxxxxx, xxxxxxx xxxxxx, xxxxxxxxxxx x xxxxxxxxxxxxxx

    xxxxxxx xxxxx

    GZd7ebI.jpg

    xxxxxx xxxxxxxxxxx, xxxxxxx xxxxxx

    xxxxxxx xxxxx

    0fNZW.jpg

    xxxxx xxx xxxx xx "xxxxxxxxxxxx xx xxxxx", xxxxxx, xxxxxxx xxx

    https://another-social-website.us/viewtopic.php?pi … 0#p2631680 - https://another-social-website.us/viewtopic.php?pi … 2#p2631752
    https://another-social-website.us/viewtopic.php?pi … 6#p2631806, https://another-social-website.us/viewtopic.php?pi … 7#p2631807
    https://another-social-website.us/viewtopic.php?pi … 7#p2631857, https://another-social-website.us/viewtopic.php?pi … 3#p2631863
    https://another-social-website.us/viewtopic.php?pi … 5#p2676525, https://another-social-website.us/viewtopic.php?pi … 1#p2676531, https://another-social-website.us/viewtopic.php?pi … 7#p2676537, https://another-social-website.us/viewtopic.php?pi … 8#p2676558
    https://another-social-website.us/viewtopic.php?pi … 0#p2679840
    https://another-social-website.us/viewtopic.php?pi … 8#p2681248, https://another-social-website.us/viewtopic.php?pi … 5#p2681325 - https://another-social-website.us/viewtopic.php?pi … 0#p2681360, https://another-social-website.us/viewtopic.php?pi … 7#p2683707
    https://another-social-website.us/viewtopic.php?pi … 2#p2695032, https://another-social-website.us/viewtopic.php?pi … 9#p2695149, https://another-social-website.us/viewtopic.php?pi … 9#p2695169, https://another-social-website.us/viewtopic.php?pi … 9#p2695179
    https://another-social-website.us/viewtopic.php?pi … 8#p2696508, https://another-social-website.us/viewtopic.php?pi … 0#p2696530, https://another-social-website.us/viewtopic.php?pi … 6#p2696556 - https://another-social-website.us/viewtopic.php?pi … 0#p2696580, https://another-social-website.us/viewtopic.php?pi … 5#p2696595, https://another-social-website.us/viewtopic.php?pi … 0#p2696600, https://another-social-website.us/viewtopic.php?pi … 3#p2696613, https://another-social-website.us/viewtopic.php?pi … 1#p2696621, https://another-social-website.us/viewtopic.php?pi … 6#p2696676, https://another-social-website.us/viewtopic.php?pi … 2#p2696722
    https://another-social-website.us/viewtopic.php?pi … 1#p2700271, https://another-social-website.us/viewtopic.php?pi … 2#p2700282, https://another-social-website.us/viewtopic.php?pi … 5#p2700295, https://another-social-website.us/viewtopic.php?pi … 3#p2700303, https://another-social-website.us/viewtopic.php?pi … 5#p2700315, https://another-social-website.us/viewtopic.php?pi … 7#p2700317, https://another-social-website.us/viewtopic.php?pi … 0#p2700330, https://another-social-website.us/viewtopic.php?pi … 4#p2700334, https://another-social-website.us/viewtopic.php?pi … 2#p2700342, https://another-social-website.us/viewtopic.php?pi … 0#p2700350, https://another-social-website.us/viewtopic.php?pi … 8#p2700358, https://another-social-website.us/viewtopic.php?pi … 0#p2700380, https://another-social-website.us/viewtopic.php?pi … 4#p2700384 - https://another-social-website.us/viewtopic.php?pi … 3#p2700393, https://another-social-website.us/viewtopic.php?pi … 8#p2700398, https://another-social-website.us/viewtopic.php?pi … 5#p2700405, https://another-social-website.us/viewtopic.php?pi … 1#p2700521, https://another-social-website.us/viewtopic.php?pi … 0#p2700530, https://another-social-website.us/viewtopic.php?pi … 6#p2700536, https://another-social-website.us/viewtopic.php?pi … 5#p2700545
    https://another-social-website.us/viewtopic.php?pi … 7#p2708197
    https://another-social-website.us/viewtopic.php?pi … 3#p2710533, https://another-social-website.us/viewtopic.php?pi … 4#p2710554, https://another-social-website.us/viewtopic.php?pi … 7#p2710567, https://another-social-website.us/viewtopic.php?pi … 7#p2710597, https://another-social-website.us/viewtopic.php?pi … 8#p2710608, https://another-social-website.us/viewtopic.php?pi … 0#p2710620, https://another-social-website.us/viewtopic.php?pi … 8#p2710628, https://another-social-website.us/viewtopic.php?pi … 0#p2710660, https://another-social-website.us/viewtopic.php?pi … 5#p2710765, https://another-social-website.us/viewtopic.php?pi … 6#p2710836
    https://another-social-website.us/viewtopic.php?pi … 6#p2715586, https://another-social-website.us/viewtopic.php?pi … 5#p2715605
    https://another-social-website.us/viewtopic.php?pi … 1#p2720651 - https://another-social-website.us/viewtopic.php?pi … 1#p2720671, https://another-social-website.us/viewtopic.php?pi … 4#p2720674 - https://another-social-website.us/viewtopic.php?pi … 5#p2720695, https://another-social-website.us/viewtopic.php?pi … 3#p2720713 - https://another-social-website.us/viewtopic.php?pi … 7#p2720807, https://another-social-website.us/viewtopic.php?pi … 1#p2720971, https://another-social-website.us/viewtopic.php?pi … 7#p2720977
    https://another-social-website.us/viewtopic.php?pi … 5#p2721135, https://another-social-website.us/viewtopic.php?pi … 3#p2721143, https://another-social-website.us/viewtopic.php?pi … 3#p2721163 - https://another-social-website.us/viewtopic.php?pi … 2#p2721182, https://another-social-website.us/viewtopic.php?pi … 8#p2721188, https://another-social-website.us/viewtopic.php?pi … 0#p2721200 - https://another-social-website.us/viewtopic.php?pi … 4#p2721234, https://another-social-website.us/viewtopic.php?pi … 0#p2721260 - https://another-social-website.us/viewtopic.php?pi … 4#p2721314, https://another-social-website.us/viewtopic.php?pi … 7#p2721317 - https://another-social-website.us/viewtopic.php?pi … 5#p2721365
    https://another-social-website.us/viewtopic.php?pi … 4#p2727554, https://another-social-website.us/viewtopic.php?pi … 9#p2727589, https://another-social-website.us/viewtopic.php?pi … 5#p2727595 - https://another-social-website.us/viewtopic.php?pi … 0#p2727610, https://another-social-website.us/viewtopic.php?pi … 6#p2727626 - https://another-social-website.us/viewtopic.php?pi … 9#p2727739, https://another-social-website.us/viewtopic.php?pi … 4#p2727854, https://another-social-website.us/viewtopic.php?pi … 9#p2727959, https://another-social-website.us/viewtopic.php?pi … 3#p2727963, https://another-social-website.us/viewtopic.php?pi … 6#p2728006 - https://another-social-website.us/viewtopic.php?pi … 9#p2728029, https://another-social-website.us/viewtopic.php?pi … 7#p2728037 - https://another-social-website.us/viewtopic.php?pi … 9#p2728049, https://another-social-website.us/viewtopic.php?pi … 5#p2728075, https://another-social-website.us/viewtopic.php?pi … 6#p2728116
    https://another-social-website.us/viewtopic.php?pi … 6#p2728296, https://another-social-website.us/viewtopic.php?pi … 8#p2728428, https://another-social-website.us/viewtopic.php?pi … 4#p2728444, https://another-social-website.us/viewtopic.php?pi … 3#p2728463, https://another-social-website.us/viewtopic.php?pi … 7#p2728467, https://another-social-website.us/viewtopic.php?pi … 5#p2728475, https://another-social-website.us/viewtopic.php?pi … 6#p2728486, https://another-social-website.us/viewtopic.php?pi … 0#p2728520, https://another-social-website.us/viewtopic.php?pi … 8#p2728528, https://another-social-website.us/viewtopic.php?pi … 5#p2728535, https://another-social-website.us/viewtopic.php?pi … 0#p2728540, https://another-social-website.us/viewtopic.php?pi … 8#p2728558, https://another-social-website.us/viewtopic.php?pi … 0#p2728600
    https://another-social-website.us/viewtopic.php?pi … 5#p2728685, https://another-social-website.us/viewtopic.php?pi … 7#p2728687
    https://another-social-website.us/viewtopic.php?pi … 4#p2730924, https://another-social-website.us/viewtopic.php?pi … 5#p2730935 - https://another-social-website.us/viewtopic.php?pi … 9#p2730989
    https://another-social-website.us/viewtopic.php?pi … 7#p2732017, https://another-social-website.us/viewtopic.php?pi … 2#p2732032 - https://another-social-website.us/viewtopic.php?pi … 5#p2732055
    https://another-social-website.us/viewtopic.php?pi … 7#p2746457, https://another-social-website.us/viewtopic.php?pi … 2#p2746462
    https://another-social-website.us/viewtopic.php?pi … 5#p2748085, https://another-social-website.us/viewtopic.php?pi … 9#p2748119
    https://another-social-website.us/viewtopic.php?pi … 3#p2751213, https://another-social-website.us/viewtopic.php?pi … 8#p2751278 - https://another-social-website.us/viewtopic.php?pi … 7#p2751637, https://another-social-website.us/viewtopic.php?pi … 4#p2751714 - https://another-social-website.us/viewtopic.php?pi … 4#p2751754, https://another-social-website.us/viewtopic.php?pi … 7#p2751767, https://another-social-website.us/viewtopic.php?pi … 1#p2751921, https://another-social-website.us/viewtopic.php?pi … 5#p2751925, https://another-social-website.us/viewtopic.php?pi … 2#p2751942, https://another-social-website.us/viewtopic.php?pi … 8#p2751978, https://another-social-website.us/viewtopic.php?pi … 3#p2751983 - https://another-social-website.us/viewtopic.php?pi … 8#p2752008
    https://another-social-website.us/viewtopic.php?pi … 0#p2756510, https://another-social-website.us/viewtopic.php?pi … 7#p2756517, https://another-social-website.us/viewtopic.php?pi … 8#p2756518, https://another-social-website.us/viewtopic.php?pi … 4#p2756524, https://another-social-website.us/viewtopic.php?pi … 1#p2756531, https://another-social-website.us/viewtopic.php?pi … 9#p2756539, https://another-social-website.us/viewtopic.php?pi … 4#p2756544 - https://another-social-website.us/viewtopic.php?pi … 3#p2756553, https://another-social-website.us/viewtopic.php?pi … 7#p2756557, https://another-social-website.us/viewtopic.php?pi … 6#p2756676, https://another-social-website.us/viewtopic.php?pi … 5#p2756725, https://another-social-website.us/viewtopic.php?pi … 3#p2756733, https://another-social-website.us/viewtopic.php?pi … 1#p2756741 - https://another-social-website.us/viewtopic.php?pi … 1#p2756811, https://another-social-website.us/viewtopic.php?pi … 7#p2756827, https://another-social-website.us/viewtopic.php?pi … 1#p2756931, https://another-social-website.us/viewtopic.php?pi … 1#p2757391, https://another-social-website.us/viewtopic.php?pi … 4#p2757394, https://another-social-website.us/viewtopic.php?pi … 5#p2757425, https://another-social-website.us/viewtopic.php?pi … 5#p2757445, https://another-social-website.us/viewtopic.php?pi … 3#p2757463, https://another-social-website.us/viewtopic.php?pi … 7#p2757487, https://another-social-website.us/viewtopic.php?pi … 5#p2757645, https://another-social-website.us/viewtopic.php?pi … 4#p2757654 - https://another-social-website.us/viewtopic.php?pi … 6#p2757706, https://another-social-website.us/viewtopic.php?pi … 8#p2757728 - https://another-social-website.us/viewtopic.php?pi … 1#p2757741, https://another-social-website.us/viewtopic.php?pi … 7#p2759587 - https://another-social-website.us/viewtopic.php?pi … 1#p2759701, https://another-social-website.us/viewtopic.php?pi … 6#p2759706, 
https://another-social-website.us/viewtopic.php?pi … 5#p2759785, https://another-social-website.us/viewtopic.php?pi … 7#p2759837, https://another-social-website.us/viewtopic.php?pi … 1#p2759961, https://another-social-website.us/viewtopic.php?pi … 0#p2761860, https://another-social-website.us/viewtopic.php?pi … 0#p2761860, https://another-social-website.us/viewtopic.php?pi … 9#p2763869, https://another-social-website.us/viewtopic.php?pi … 7#p2763887, https://another-social-website.us/viewtopic.php?pi … 5#p2769635, https://another-social-website.us/viewtopic.php?pi … 2#p2769652, https://another-social-website.us/viewtopic.php?pi … 9#p2769689, https://another-social-website.us/viewtopic.php?pi … 7#p2769887
    https://another-social-website.us/viewtopic.php?pi … 0#p2771490 - https://another-social-website.us/viewtopic.php?pi … 5#p2771705, https://another-social-website.us/viewtopic.php?pi … 7#p2771757, https://another-social-website.us/viewtopic.php?pi … 3#p2771763 - https://another-social-website.us/viewtopic.php?pi … 2#p2771772, https://another-social-website.us/viewtopic.php?pi … 6#p2771776 - https://another-social-website.us/viewtopic.php?pi … 4#p2771834, https://another-social-website.us/viewtopic.php?pi … 5#p2771845 - https://another-social-website.us/viewtopic.php?pi … 4#p2771874, https://another-social-website.us/viewtopic.php?pi … 7#p2771907 - https://another-social-website.us/viewtopic.php?pi … 1#p2771941, https://another-social-website.us/viewtopic.php?pi … 9#p2771959 - https://another-social-website.us/viewtopic.php?pi … 8#p2771988, https://another-social-website.us/viewtopic.php?pi … 1#p2771991 - https://another-social-website.us/viewtopic.php?pi … 2#p2772112, https://another-social-website.us/viewtopic.php?pi … 6#p2772116, https://another-social-website.us/viewtopic.php?pi … 8#p2772188, https://another-social-website.us/viewtopic.php?pi … 2#p2772352 - https://another-social-website.us/viewtopic.php?pi … 2#p2772482, https://another-social-website.us/viewtopic.php?pi … 0#p2772550, https://another-social-website.us/viewtopic.php?pi … 5#p2772605, https://another-social-website.us/viewtopic.php?pi … 5#p2772815 - https://another-social-website.us/viewtopic.php?pi … 4#p2772894, https://another-social-website.us/viewtopic.php?pi … 7#p2772927 - https://another-social-website.us/viewtopic.php?pi … 2#p2773102
    https://another-social-website.us/viewtopic.php?pi … 1#p2794251, https://another-social-website.us/viewtopic.php?pi … 7#p2794277 - https://another-social-website.us/viewtopic.php?pi … 5#p2794325, https://another-social-website.us/viewtopic.php?pi … 9#p2794359 - https://another-social-website.us/viewtopic.php?pi … 4#p2794414, https://another-social-website.us/viewtopic.php?pi … 6#p2794456

    html2md-0.2.15/test-samples/marcfs-readme.html000064400000000000000000000226151046102023000173030ustar 00000000000000

    Gitter
    build status
    License

    MARC-FS

    Mail.ru Cloud filesystem written for FUSE

    Synopsis

    This is an implementation of a simple filesystem with all calls and hooks needed for normal file operations. After mounting it you’ll be provided access to all your cloud files remotely stored on Mail.ru Cloud as if they were local ones. You should keep in mind that this is a network-driven FS and so it will never be as fast as any local one, but having a folder connected as remote drive in 9P/GNU Hurd fashion can be convenient at times.

    Bear in mind that this project is still in its infancy, sudden errors/crashes/memory leaks may occur.

    Features

    • cloud storage is represented as local folder
    • rm, cp, ls, rmdir, touch, grep and so on are working
    • filesystem stats are working, can check with df
    • multithreaded, you can work with multiple files at once
    • support for files > 2GB by seamless splitting/joining uploaded/downloaded files
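
The splitting mentioned in the last feature can be pictured with a small sketch. This is not MARC-FS code: the `PART_LIMIT` constant, the `split_into_parts` function and the exact 2 GiB boundary are assumptions made for illustration, since the README only says that files larger than 2GB are split and joined transparently.

```rust
/// Hypothetical upper bound for a single uploaded part (2 GiB here).
/// The real limit used by MARC-FS or Mail.ru Cloud is not stated in this README.
const PART_LIMIT: u64 = 2 * 1024 * 1024 * 1024;

/// Splits a file of `total` bytes into consecutive `(offset, length)` parts,
/// each at most `limit` bytes long. An empty file yields no parts.
fn split_into_parts(total: u64, limit: u64) -> Vec<(u64, u64)> {
    assert!(limit > 0, "part limit must be positive");
    let mut parts = Vec::new();
    let mut offset = 0u64;
    while offset < total {
        // the last part may be shorter than the limit
        let len = limit.min(total - offset);
        parts.push((offset, len));
        offset += len;
    }
    parts
}

fn main() {
    // a 5 GiB file is covered by three parts under the assumed limit
    let five_gib = 5u64 * 1024 * 1024 * 1024;
    for (index, (offset, len)) in split_into_parts(five_gib, PART_LIMIT).iter().enumerate() {
        println!("part {}: offset {} length {}", index, offset, len);
    }
}
```

Joining on download would be the reverse: reading the parts in offset order and concatenating them.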

    Installation & Usage

    You should have cmake and g++ with C++14 support at hand.
    MARC-FS also requires libfuse (obviously), libcurl (min 7.34) and pthread libraries. Once you have all this, do as usual:

    $ git clone --recursive https://gitlab.com/Kanedias/MARC-FS.git
    $ cd MARC-FS
    $ mkdir build
    $ cd build && cmake ..
    $ make
    $ # here goes the step where you actually go and register on mail.ru website to obtain cloud storage and auth info
    $ ./marcfs /path/to/mount/folder -o username=your.email@mail.ru,password=your.password,cachedir=/path/to/cache
    

    If you want your files on Mail.ru Cloud to be encrypted, you may use nested EncFS filesystem to achieve this:

    $ ./marcfs /path/to/mount/folder -o username=your.email@mail.ru,password=your.password
    $ mkdir /path/to/mount/folder/encrypted # needed only once when you init your EncFS
    $ encfs --no-default-flags /path/to/mount/folder/encrypted /path/to/decrypted/dir
    $ cp whatever /path/to/decrypted/dir
    $ # at this point encrypted data will appear in Cloud Mail.ru storage
    

    If you want to use rsync to synchronize local and remote sides, use the --size-only option.
    Rsync compares mtime and size of file by default, but Mail.ru Cloud saves only seconds in mtime,
    which causes false-positives and reuploads of identical files:

    $ rsync -av --delete --size-only /path/to/local/folder/ ~/path/to/mount/folder
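
The reasoning behind `--size-only` can be sketched in a few lines. The helper names below (`truncate_to_seconds`, `quick_check_equal`, `size_only_equal`) are made up for illustration and only model the comparison described above; they are not rsync or MARC-FS code.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Drops the sub-second part of a timestamp, modelling a storage backend
/// that keeps only whole seconds of mtime (an assumption for this sketch).
fn truncate_to_seconds(t: SystemTime) -> SystemTime {
    let since_epoch = t
        .duration_since(UNIX_EPOCH)
        .expect("timestamp is before the Unix epoch");
    UNIX_EPOCH + Duration::from_secs(since_epoch.as_secs())
}

/// Default-style check: a file counts as unchanged only if both size and mtime match.
fn quick_check_equal(local: (u64, SystemTime), remote: (u64, SystemTime)) -> bool {
    local.0 == remote.0 && local.1 == remote.1
}

/// Size-only check: mtime is ignored entirely.
fn size_only_equal(local_size: u64, remote_size: u64) -> bool {
    local_size == remote_size
}

fn main() {
    // local mtime carries nanoseconds, the remote copy has lost them
    let local_mtime = UNIX_EPOCH + Duration::new(1_500_000_000, 123_456_789);
    let remote_mtime = truncate_to_seconds(local_mtime);
    let size = 4096u64;

    // identical content, yet the size + mtime comparison reports a difference
    println!(
        "size + mtime equal: {}",
        quick_check_equal((size, local_mtime), (size, remote_mtime))
    );
    println!("size only equal: {}", size_only_equal(size, size));
}
```

With the truncated remote timestamp the size and mtime comparison treats an identical file as changed, which is the false positive the paragraph above describes; comparing sizes alone avoids the reupload.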
    

    To unmount previously mounted share, make sure no one uses it and execute:

    $ # if you mounted encfs previously, first unmount it
    $ # fusermount -u /path/to/mount/folder/encrypted
    $ fusermount -u /path/to/mount/folder
    

    If you want to get a shared link to the file, you should create a file with special name, *.marcfs-link

    $ # suppose we want to get a public link to file 'picture.png'
    $ touch picture.png.marcfs-link
    $ cat picture.png.marcfs-link
    /path/to/file/picture.png: https://cloud.mail.ru/public/LINK/ADDRESS
    

    Files with size > 2G will show up as series of shared links for each part.
    After getting the link, the special file can be safely removed.

    Notes

    External configuration

    If you don’t want to type credentials on the command line you can use config file for that.
    The file is ~/.config/marcfs/config.json (default XDG basedir spec).
    You can override its location via the -o conffile=/path/to/config option. Example config:

    {
        "username": "user@mail.ru",
        "password": "password",
        "cachedir": "/absolute/path",
        "proxyurl": "http://localhost:3128"
    }
    

    Cache dir

    MARC-FS has two modes of operation. If no cachedir option is given, it stores all intermediate download/upload
    data directly in memory. If you copy large files as HD movies or ISO files, it may eat up your RAM pretty quickly,
    so be careful. This one is useful if you want to copy your photo library to/from the cloud - this will actually take
    a lot less time than with second option.

    If cachedir option is given, MARC-FS stores all intermediate data there. It means, all files that are currently open
    in some process, copied/read or being edited - will have their data stored in this dir. This may sound like plenty
    of space, but most software executes file operations sequentially, so in case of copying a large media library to/from
    the cloud you won’t need more free space than largest one of the files occupies.

    API references

    Motivation

    Mail.ru is one of the largest Russian social networks. It provides mail services, hosting, gaming platforms and, incidentally, cloud services, similar to Dropbox, NextCloud etc.

    Once upon a time Mail.ru did a discount for this cloud solution and provided beta testers (and your humble servant among them) with free 1 TiB storage.

    And so… A holy place is never empty.

    Bugs & Known issues

    1. Temporary
    • Some issues may arise if you delete/move a file that is currently being copied or read. Please report such bugs here.
    • big memory footprint due to
      • SSL engine sessions - tend to become bigger with time (WIP)
      • heap fragmentation (WIP)
      • MADV_FREE - lazy memory reclaiming in Linux > 4.5 (not a bug actually)
    • On RHEL-based distros (CentOS/Fedora) you may need NSS_STRICT_NOFORK=DISABLED environment variable (see this and this)
    1. Principal (Mail.ru Cloud API limitations)
    • No extended attr/chmod support, all files on storage are owned by you
    • No atime/ctime support, only mtime is stored
    • No mtime support for directories, expect all of them to have Jan 1 1970 date in ls
    • No Transfer-Encoding: chunked support for POST requests in cloud nginx (chunkin on/proxy_request_buffering options in nginx/tengine config), so files are read fully into memory before uploading

    Contributions

    You may create merge request or bug/enhancement issue right here on GitLab, or send formatted patch via e-mail. For details see CONTRIBUTING.md file in this repo.
    Audits from code style and security standpoint are also much appreciated.

    License

    Copyright (C) 2016-2017  Oleg `Kanedias` Chernovskiy
    
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
    
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.
    
    html2md-0.2.15/test-samples/marcfs-readme.md000064400000000000000000000167561046102023000167500ustar 00000000000000[![Gitter](https://img.shields.io/gitter/room/MARC-FS/MARC-FS.svg)](https://gitter.im/MARC-FS/Lobby?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [![build status](https://gitlab.com/Kanedias/MARC-FS/badges/master/build.svg)](https://gitlab.com/Kanedias/MARC-FS/commits/master) [![License](https://img.shields.io/aur/license/marcfs-git.svg)](https://www.gnu.org/licenses/gpl-3.0.html) MARC-FS =========== Mail.ru Cloud filesystem written for FUSE Synopsis -------- This is an implementation of a simple filesystem with all calls and hooks needed for normal file operations. After mounting it you'll be provided access to all your cloud files remotely stored on Mail.ru Cloud as if they were local ones. You should keep in mind that this is a network-driven FS and so it will never be as fast as any local one, but having a folder connected as remote drive in 9P/GNU Hurd fashion can be convenient at a times. **Bear in mind that this project is still in its infancy, sudden errors/crashes/memory leaks may occur.** Features -------- - cloud storage is represented as local folder - `rm`, `cp`, `ls`, `rmdir`, `touch`, `grep` and so on are working - filesystem stats are working, can check with `df` - multithreaded, you can work with multiple files at once - support for files > 2GB by seamless splitting/joining uploaded/downloaded files Installation & Usage -------------------- You should have cmake and g++ with C++14 support at hand. MARC-FS also requires `libfuse` (obviously), `libcurl` (min 7.34) and `pthread` libraries. Once you have all this, do as usual: $ git clone --recursive https://gitlab.com/Kanedias/MARC-FS.git $ cd MARC-FS $ mkdir build $ cd build && cmake .. 
$ make $ # here goes the step where you actually go and register on mail.ru website to obtain cloud storage and auth info $ ./marcfs /path/to/mount/folder -o username=your.email@mail.ru,password=your.password,cachedir=/path/to/cache If you want your files on Mail.ru Cloud to be encrypted, you may use nested EncFS filesystem to achieve this: $ ./marcfs /path/to/mount/folder -o username=your.email@mail.ru,password=your.password $ mkdir /path/to/mount/folder/encrypted # needed only once when you init your EncFS $ encfs --no-default-flags /path/to/mount/folder/encrypted /path/to/decrypted/dir $ cp whatever /path/to/decrypted/dir $ # at this point encrypted data will appear in Cloud Mail.ru storage If you want to use rsync to synchronize local and remote sides, use the `--size-only` option. Rsync compares mtime and size of file by default, but Mail.ru Cloud saves only seconds in mtime, which causes false-positives and reuploads of identical files: $ rsync -av --delete --size-only /path/to/local/folder/ ~/path/to/mount/folder To unmount previously mounted share, make sure no one uses it and execute: $ # if you mounted encfs previously, first unmount it $ # fusermount -u /path/to/mount/folder/encrypted $ fusermount -u /path/to/mount/folder If you want to get a shared link to the file, you should create a file with special name, `*.marcfs-link` $ # suppose we want to get a public link to file 'picture.png' $ touch picture.png.marcfs-link $ cat picture.png.marcfs-link /path/to/file/picture.png: https://cloud.mail.ru/public/LINK/ADDRESS Files with size > 2G will show up as series of shared links for each part. After getting the link special file can be safely removed. Notes ----- #### External configuration #### If you don't want to type credentials on the command line you can use config file for that. The file is `~/.config/marcfs/config.json` (default [XDG basedir spec](https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html)). 
You can override its location via `-o conffile=/path/to/config` option. Example config: ```json { "username": "user@mail.ru", "password": "password", "cachedir": "/absolute/path", "proxyurl": "http://localhost:3128" } ``` #### Cache dir #### MARC-FS has two modes of operation. If no cachedir option is given, it stores all intermediate download/upload data directly in memory. If you copy large files such as HD movies or ISO files, it may eat up your RAM pretty quickly, so be careful. This one is useful if you want to copy your photo library to/from the cloud - this will actually take a lot less time than with second option. If cachedir option is given, MARC-FS stores all intermediate data there. It means, all files that are currently open in some process, copied/read or being edited - will have their data stored in this dir. This may sound like plenty of space, but most software executes file operations sequentially, so in case of copying large media library on/from the cloud you won't need more free space than largest one of the files occupies. API references -------------- - There is no official Mail.ru Cloud API reference, everything is reverse-engineered. You may refer to [Doxygen API comments](https://gitlab.com/Kanedias/MARC-FS/blob/master/marc_api.h) to grasp concept of what's going on. - FUSE: [API overview](https://www.cs.hmc.edu/~geoff/classes/hmc.cs135.201109/homework/fuse/fuse_doc.html) - used to implement FS calls - cURL: [API overview](https://curl.haxx.se/docs/) - used to interact with Mail.ru Cloud REST API Motivation ---------- Mail.ru is one of the largest Russian social networks. It provides mail services, hosting, gaming platforms and, incidentally, cloud services, similar to Dropbox, NextCloud etc. Once upon a time Mail.ru did a discount for this cloud solution and provided beta testers (and your humble servant among them) with free 1 TiB storage. And so... A holy place is never empty. Bugs & Known issues ------------------- 1. 
Temporary - Some issues may arise if you delete/move file that is currently copied or read. Please report such bugs here. - big memory footprint due to - SSL engine sessions - tend to become bigger with time (WIP) - heap fragmentation (WIP) - MADV_FREE - lazy memory reclaiming in Linux > 4.5 (not a bug actually) - On RHEL-based distros (CentOS/Fedora) you may need `NSS_STRICT_NOFORK=DISABLED` environment variable (see [this](https://gitlab.com/Kanedias/MARC-FS/issues/6) and [this](https://bugzilla.redhat.com/show_bug.cgi?id=1317691)) 2. Principal (Mail.ru Cloud API limitations) - No extended attr/chmod support, all files on storage are owned by you - No atime/ctime support, only mtime is stored - No mtime support for directories, expect all of them to have `Jan 1 1970` date in `ls` - No `Transfer-Encoding: chunked` support for POST **requests** in cloud nginx (`chunkin on`/`proxy_request_buffering` options in `nginx`/`tengine` config), so files are read fully into memory before uploading Contributions ------------ You may create merge request or bug/enhancement issue right here on GitLab, or send formatted patch via e-mail. For details see CONTRIBUTING.md file in this repo. Audits from code style and security standpoint are also much appreciated. License ------- Copyright (C) 2016-2017 Oleg `Kanedias` Chernovskiy This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. html2md-0.2.15/test-samples/markdown-cheatsheet.html000064400000000000000000000342741046102023000205360ustar 00000000000000

    This is intended as a quick reference and showcase. For more complete info, see John Gruber’s original spec and the Github-flavored Markdown info page.

    Note that there is also a Cheatsheet specific to Markdown Here if that’s what you’re looking for. You can also check out more Markdown tools.

    Table of Contents

    Headers
    Emphasis
    Lists
    Links
    Images
    Code and Syntax Highlighting
    Tables
    Blockquotes
    Inline HTML
    Horizontal Rule
    Line Breaks
    YouTube Videos

    Headers

    # H1
    ## H2
    ### H3
    #### H4
    ##### H5
    ###### H6
    
    Alternatively, for H1 and H2, an underline-ish style:
    
    Alt-H1
    ======
    
    Alt-H2
    ------
    

    H1

    H2

    H3

    H4

    H5
    H6

    Alternatively, for H1 and H2, an underline-ish style:

    Alt-H1

    Alt-H2
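
As a rough illustration of the `#`-style header syntax shown above, the sketch below extracts the level and title from a single line. The function name `atx_level` is made up for this example, and the underline-ish Alt-H1/Alt-H2 style is deliberately not handled.

```python
import re


def atx_level(line):
    """Return (level, title) for a '#'-style header line, or None otherwise.

    Only one to six leading '#' characters followed by whitespace count
    as a header; underline-style headers are out of scope for this sketch.
    """
    match = re.match(r"^(#{1,6})\s+(.*)$", line)
    if match is None:
        return None
    return len(match.group(1)), match.group(2).strip()


for sample in ["# H1", "### H3", "###### H6", "Alt-H1"]:
    print(repr(sample), "->", atx_level(sample))
```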

    Emphasis

    Emphasis, aka italics, with *asterisks* or _underscores_.
    
    Strong emphasis, aka bold, with **asterisks** or __underscores__.
    
    Combined emphasis with **asterisks and _underscores_**.
    
    Strikethrough uses two tildes. ~~Scratch this.~~
    

    Emphasis, aka italics, with asterisks or underscores.

    Strong emphasis, aka bold, with asterisks or underscores.

    Combined emphasis with asterisks and underscores.

    Strikethrough uses two tildes. Scratch this.

    Lists

    (In this example, leading and trailing spaces are shown with dots: ⋅)

    1. First ordered list item
    2. Another item
    ⋅⋅* Unordered sub-list. 
    1. Actual numbers don't matter, just that it's a number
    ⋅⋅1. Ordered sub-list
    4. And another item.
    
    ⋅⋅⋅You can have properly indented paragraphs within list items. Notice the blank line above, and the leading spaces (at least one, but we'll use three here to also align the raw Markdown).
    
    ⋅⋅⋅To have a line break without a paragraph, you will need to use two trailing spaces.⋅⋅
    ⋅⋅⋅Note that this line is separate, but within the same paragraph.⋅⋅
    ⋅⋅⋅(This is contrary to the typical GFM line break behaviour, where trailing spaces are not required.)
    
    * Unordered list can use asterisks
    - Or minuses
    + Or pluses
    
    1. First ordered list item

    2. Another item

      • Unordered sub-list.
    3. Actual numbers don’t matter, just that it’s a number

      1. Ordered sub-list
    4. And another item.

      You can have properly indented paragraphs within list items. Notice the blank line above, and the leading spaces (at least one, but we’ll use three here to also align the raw Markdown).

      To have a line break without a paragraph, you will need to use two trailing spaces.
      Note that this line is separate, but within the same paragraph.
      (This is contrary to the typical GFM line break behaviour, where trailing spaces are not required.)

    • Unordered list can use asterisks
    • Or minuses
    • Or pluses

    There are two ways to create links.

    [I'm an inline-style link](https://www.google.com)
    
    [I'm an inline-style link with title](https://www.google.com "Google's Homepage")
    
    [I'm a reference-style link][Arbitrary case-insensitive reference text]
    
    [I'm a relative reference to a repository file](../blob/master/LICENSE)
    
    [You can use numbers for reference-style link definitions][1]
    
    Or leave it empty and use the [link text itself].
    
    URLs and URLs in angle brackets will automatically get turned into links. 
    http://www.example.com or <http://www.example.com> and sometimes 
    example.com (but not on Github, for example).
    
    Some text to show that the reference links can follow later.
    
    [arbitrary case-insensitive reference text]: https://www.mozilla.org
    [1]: http://slashdot.org
    [link text itself]: http://www.reddit.com
    

    I’m an inline-style link

    I’m an inline-style link with title

    I’m a reference-style link

    I’m a relative reference to a repository file

    You can use numbers for reference-style link definitions

    Or leave it empty and use the link text itself.

    URLs and URLs in angle brackets will automatically get turned into links.
    http://www.example.com or http://www.example.com and sometimes
    example.com (but not on Github, for example).

    Some text to show that the reference links can follow later.

    Images

    Here's our logo (hover to see the title text):
    
    Inline-style: 
    ![alt text](https://github.com/adam-p/markdown-here/raw/master/src/common/images/icon48.png "Logo Title Text 1")
    
    Reference-style: 
    ![alt text][logo]
    
    [logo]: https://github.com/adam-p/markdown-here/raw/master/src/common/images/icon48.png "Logo Title Text 2"
    

    Here’s our logo (hover to see the title text):

    Inline-style:
    alt text

    Reference-style:
    alt text

    Code and Syntax Highlighting

    Code blocks are part of the Markdown spec, but syntax highlighting isn’t. However, many renderers – like Github’s and Markdown Here – support syntax highlighting. Which languages are supported and how those language names should be written will vary from renderer to renderer. Markdown Here supports highlighting for dozens of languages (and not-really-languages, like diffs and HTTP headers); to see the complete list, and how to write the language names, see the highlight.js demo page.

    Inline `code` has `back-ticks around` it.
    

    Inline code has back-ticks around it.

    Blocks of code are either fenced by lines with three back-ticks ```, or are indented with four spaces. I recommend only using the fenced code blocks – they’re easier and only they support syntax highlighting.

    ```javascript
    var s = "JavaScript syntax highlighting";
    alert(s);
    ```
     
    ```python
    s = "Python syntax highlighting"
    print s
    ```
     
    ```
    No language indicated, so no syntax highlighting. 
    But let's throw in a <b>tag</b>.
    ```
    
    var s = "JavaScript syntax highlighting";
    alert(s);
    
    s = "Python syntax highlighting"
    print s
    
    No language indicated, so no syntax highlighting in Markdown Here (varies on Github). 
    But let's throw in a <b>tag</b>.
    

    Tables

    Tables aren’t part of the core Markdown spec, but they are part of GFM and Markdown Here supports them. They are an easy way of adding tables to your email – a task that would otherwise require copy-pasting from another application.

    Colons can be used to align columns.
    
    | Tables        | Are           | Cool  |
    | ------------- |:-------------:| -----:|
    | col 3 is      | right-aligned | $1600 |
    | col 2 is      | centered      |   $12 |
    | zebra stripes | are neat      |    $1 |
    
    There must be at least 3 dashes separating each header cell.
    The outer pipes (|) are optional, and you don't need to make the 
    raw Markdown line up prettily. You can also use inline Markdown.
    
    Markdown | Less | Pretty
    --- | --- | ---
    *Still* | `renders` | **nicely**
    1 | 2 | 3
    

    Colons can be used to align columns.

    Tables Are Cool
    col 3 is right-aligned $1600
    col 2 is centered $12
    zebra stripes are neat $1

    There must be at least 3 dashes separating each header cell. The outer pipes (|) are optional, and you don’t need to make the raw Markdown line up prettily. You can also use inline Markdown.

    Markdown Less Pretty
    Still renders nicely
    1 2 3
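
To make the colon rule concrete, here is a small sketch that reads the delimiter row of a table and reports each column's alignment. `column_alignments` is a hypothetical helper written for this example, not part of any Markdown renderer, and it enforces the "at least 3 dashes" rule exactly as stated in the text above.

```python
def column_alignments(delimiter_row):
    """Map a table delimiter row to a list of per-column alignments.

    Outer pipes are optional. Every cell must consist of at least three
    dashes, optionally with a colon on either side.
    """
    cells = [c.strip() for c in delimiter_row.strip().strip("|").split("|")]
    alignments = []
    for cell in cells:
        if cell.count("-") < 3 or set(cell) - set("-:"):
            raise ValueError(f"not a valid delimiter cell: {cell!r}")
        left, right = cell.startswith(":"), cell.endswith(":")
        if left and right:
            alignments.append("center")
        elif right:
            alignments.append("right")
        elif left:
            alignments.append("left")
        else:
            alignments.append("default")
    return alignments


print(column_alignments("| ------------- |:-------------:| -----:|"))
print(column_alignments("--- | --- | ---"))
```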

    Blockquotes

    > Blockquotes are very handy in email to emulate reply text.
    > This line is part of the same quote.
    
    Quote break.
    
    > This is a very long line that will still be quoted properly when it wraps. Oh boy let's keep writing to make sure this is long enough to actually wrap for everyone. Oh, you can *put* **Markdown** into a blockquote. 
    

    Blockquotes are very handy in email to emulate reply text.
    This line is part of the same quote.

    Quote break.

    This is a very long line that will still be quoted properly when it wraps. Oh boy let’s keep writing to make sure this is long enough to actually wrap for everyone. Oh, you can put Markdown into a blockquote.

    Inline HTML

    You can also use raw HTML in your Markdown, and it’ll mostly work pretty well.

    <dl>
      <dt>Definition list</dt>
      <dd>Is something people use sometimes.</dd>
    
      <dt>Markdown in HTML</dt>
      <dd>Does *not* work **very** well. Use HTML <em>tags</em>.</dd>
    </dl>
    
    Definition list
    Is something people use sometimes.
    Markdown in HTML
    Does *not* work **very** well. Use HTML tags.

    Horizontal Rule

    Three or more...
    
    ---
    
    Hyphens
    
    ***
    
    Asterisks
    
    ___
    
    Underscores
    

    Three or more…


    Hyphens


    Asterisks


    Underscores

    Line Breaks

    My basic recommendation for learning how line breaks work is to experiment and discover – hit <Enter> once (i.e., insert one newline), then hit it twice (i.e., insert two newlines), see what happens. You’ll soon learn to get what you want. “Markdown Toggle” is your friend.

    Here are some things to try out:

    Here's a line for us to start with.
    
    This line is separated from the one above by two newlines, so it will be a *separate paragraph*.
    
    This line is also a separate paragraph, but...
    This line is only separated by a single newline, so it's a separate line in the *same paragraph*.
    

    Here’s a line for us to start with.

    This line is separated from the one above by two newlines, so it will be a separate paragraph.

    This line also begins a separate paragraph, but…
    This line is only separated by a single newline, so it’s a separate line in the same paragraph.

    (Technical note: Markdown Here uses GFM line breaks, so there’s no need to use MD’s two-space line breaks.)
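
The newline behaviour described above can be modelled in a few lines. This is a simplified sketch assuming GFM-style line breaks (a single newline starts a new line inside the same paragraph, a blank line starts a new paragraph); `split_paragraphs` is an illustrative name, not a function of any Markdown library.

```python
import re


def split_paragraphs(text):
    """Split text into paragraphs (on blank lines), each a list of lines."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return [paragraph.splitlines() for paragraph in paragraphs]


sample = (
    "Here's a line for us to start with.\n"
    "\n"
    "This line is a separate paragraph.\n"
    "\n"
    "This line also begins a separate paragraph, but...\n"
    "this line stays in the same paragraph.\n"
)

for number, lines in enumerate(split_paragraphs(sample), start=1):
    print(number, lines)
```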

    YouTube Videos

    They can’t be added directly but you can add an image with a link to the video like this:

    <a href="http://www.youtube.com/watch?feature=player_embedded&v=YOUTUBE_VIDEO_ID_HERE
    " target="_blank"><img src="http://img.youtube.com/vi/YOUTUBE_VIDEO_ID_HERE/0.jpg" 
    alt="IMAGE ALT TEXT HERE" width="240" height="180" border="10" /></a>
    

    Or, in pure Markdown, but losing the image sizing and border:

    [![IMAGE ALT TEXT HERE](http://img.youtube.com/vi/YOUTUBE_VIDEO_ID_HERE/0.jpg)](http://www.youtube.com/watch?v=YOUTUBE_VIDEO_ID_HERE)
    

    Referencing a bug by #bugID in your git commit links it to the slip. For example #1.


    License: CC-BY

    html2md-0.2.15/test-samples/markdown-cheatsheet.md000064400000000000000000000263631046102023000201720ustar 00000000000000This is intended as a quick reference and showcase. For more complete info, see [John Gruber's original spec](http://daringfireball.net/projects/markdown/) and the [Github-flavored Markdown info page](http://github.github.com/github-flavored-markdown/). Note that there is also a [Cheatsheet specific to Markdown Here](./Markdown-Here-Cheatsheet) if that's what you're looking for. You can also check out [more Markdown tools](./Other-Markdown-Tools). ##### Table of Contents [Headers](#headers) [Emphasis](#emphasis) [Lists](#lists) [Links](#links) [Images](#images) [Code and Syntax Highlighting](#code) [Tables](#tables) [Blockquotes](#blockquotes) [Inline HTML](#html) [Horizontal Rule](#hr) [Line Breaks](#lines) [YouTube Videos](#videos) ## Headers ```no-highlight # H1 ## H2 ### H3 #### H4 ##### H5 ###### H6 Alternatively, for H1 and H2, an underline-ish style: Alt-H1 ====== Alt-H2 ------ ``` # H1 ## H2 ### H3 #### H4 ##### H5 ###### H6 Alternatively, for H1 and H2, an underline-ish style: Alt-H1 ====== Alt-H2 ------ ## Emphasis ```no-highlight Emphasis, aka italics, with *asterisks* or _underscores_. Strong emphasis, aka bold, with **asterisks** or __underscores__. Combined emphasis with **asterisks and _underscores_**. Strikethrough uses two tildes. ~~Scratch this.~~ ``` Emphasis, aka italics, with *asterisks* or _underscores_. Strong emphasis, aka bold, with **asterisks** or __underscores__. Combined emphasis with **asterisks and _underscores_**. Strikethrough uses two tildes. ~~Scratch this.~~ ## Lists (In this example, leading and trailing spaces are shown with dots: ⋅) ```no-highlight 1. First ordered list item 2. Another item ⋅⋅* Unordered sub-list. 1. Actual numbers don't matter, just that it's a number ⋅⋅1. Ordered sub-list 4. And another item. ⋅⋅⋅You can have properly indented paragraphs within list items. 
Notice the blank line above, and the leading spaces (at least one, but we'll use three here to also align the raw Markdown). ⋅⋅⋅To have a line break without a paragraph, you will need to use two trailing spaces.⋅⋅ ⋅⋅⋅Note that this line is separate, but within the same paragraph.⋅⋅ ⋅⋅⋅(This is contrary to the typical GFM line break behaviour, where trailing spaces are not required.) * Unordered list can use asterisks - Or minuses + Or pluses ``` 1. First ordered list item 2. Another item * Unordered sub-list. 1. Actual numbers don't matter, just that it's a number 1. Ordered sub-list 4. And another item. You can have properly indented paragraphs within list items. Notice the blank line above, and the leading spaces (at least one, but we'll use three here to also align the raw Markdown). To have a line break without a paragraph, you will need to use two trailing spaces. Note that this line is separate, but within the same paragraph. (This is contrary to the typical GFM line break behaviour, where trailing spaces are not required.) * Unordered list can use asterisks - Or minuses + Or pluses ## Links There are two ways to create links. ```no-highlight [I'm an inline-style link](https://www.google.com) [I'm an inline-style link with title](https://www.google.com "Google's Homepage") [I'm a reference-style link][Arbitrary case-insensitive reference text] [I'm a relative reference to a repository file](../blob/master/LICENSE) [You can use numbers for reference-style link definitions][1] Or leave it empty and use the [link text itself]. URLs and URLs in angle brackets will automatically get turned into links. http://www.example.com or <http://www.example.com> and sometimes example.com (but not on Github, for example). Some text to show that the reference links can follow later. 
[arbitrary case-insensitive reference text]: https://www.mozilla.org [1]: http://slashdot.org [link text itself]: http://www.reddit.com ``` [I'm an inline-style link](https://www.google.com) [I'm an inline-style link with title](https://www.google.com "Google's Homepage") [I'm a reference-style link][Arbitrary case-insensitive reference text] [I'm a relative reference to a repository file](../blob/master/LICENSE) [You can use numbers for reference-style link definitions][1] Or leave it empty and use the [link text itself]. URLs and URLs in angle brackets will automatically get turned into links. http://www.example.com or <http://www.example.com> and sometimes example.com (but not on Github, for example). Some text to show that the reference links can follow later. [arbitrary case-insensitive reference text]: https://www.mozilla.org [1]: http://slashdot.org [link text itself]: http://www.reddit.com ## Images ```no-highlight Here's our logo (hover to see the title text): Inline-style: ![alt text](https://github.com/adam-p/markdown-here/raw/master/src/common/images/icon48.png "Logo Title Text 1") Reference-style: ![alt text][logo] [logo]: https://github.com/adam-p/markdown-here/raw/master/src/common/images/icon48.png "Logo Title Text 2" ``` Here's our logo (hover to see the title text): Inline-style: ![alt text](https://github.com/adam-p/markdown-here/raw/master/src/common/images/icon48.png "Logo Title Text 1") Reference-style: ![alt text][logo] [logo]: https://github.com/adam-p/markdown-here/raw/master/src/common/images/icon48.png "Logo Title Text 2" ## Code and Syntax Highlighting Code blocks are part of the Markdown spec, but syntax highlighting isn't. However, many renderers -- like Github's and *Markdown Here* -- support syntax highlighting. Which languages are supported and how those language names should be written will vary from renderer to renderer. 
*Markdown Here* supports highlighting for dozens of languages (and not-really-languages, like diffs and HTTP headers); to see the complete list, and how to write the language names, see the [highlight.js demo page](http://softwaremaniacs.org/media/soft/highlight/test.html). ```no-highlight Inline `code` has `back-ticks around` it. ``` Inline `code` has `back-ticks around` it. Blocks of code are either fenced by lines with three back-ticks ```, or are indented with four spaces. I recommend only using the fenced code blocks -- they're easier and only they support syntax highlighting.
    ```javascript
    var s = "JavaScript syntax highlighting";
    alert(s);
    ```
     
    ```python
    s = "Python syntax highlighting"
    print s
    ```
     
    ```
    No language indicated, so no syntax highlighting. 
    But let's throw in a <b>tag</b>.
    ```
    
    ```javascript var s = "JavaScript syntax highlighting"; alert(s); ``` ```python s = "Python syntax highlighting" print s ``` ``` No language indicated, so no syntax highlighting in Markdown Here (varies on Github). But let's throw in a <b>tag</b>. ```
    ## Tables Tables aren't part of the core Markdown spec, but they are part of GFM and *Markdown Here* supports them. They are an easy way of adding tables to your email -- a task that would otherwise require copy-pasting from another application. ```no-highlight Colons can be used to align columns. | Tables | Are | Cool | | ------------- |:-------------:| -----:| | col 3 is | right-aligned | $1600 | | col 2 is | centered | $12 | | zebra stripes | are neat | $1 | There must be at least 3 dashes separating each header cell. The outer pipes (|) are optional, and you don't need to make the raw Markdown line up prettily. You can also use inline Markdown. Markdown | Less | Pretty --- | --- | --- *Still* | `renders` | **nicely** 1 | 2 | 3 ``` Colons can be used to align columns. | Tables | Are | Cool | | ------------- |:-------------:| -----:| | col 3 is | right-aligned | $1600 | | col 2 is | centered | $12 | | zebra stripes | are neat | $1 | There must be at least 3 dashes separating each header cell. The outer pipes (|) are optional, and you don't need to make the raw Markdown line up prettily. You can also use inline Markdown. Markdown | Less | Pretty --- | --- | --- *Still* | `renders` | **nicely** 1 | 2 | 3 ## Blockquotes ```no-highlight > Blockquotes are very handy in email to emulate reply text. > This line is part of the same quote. Quote break. > This is a very long line that will still be quoted properly when it wraps. Oh boy let's keep writing to make sure this is long enough to actually wrap for everyone. Oh, you can *put* **Markdown** into a blockquote. ``` > Blockquotes are very handy in email to emulate reply text. > This line is part of the same quote. Quote break. > This is a very long line that will still be quoted properly when it wraps. Oh boy let's keep writing to make sure this is long enough to actually wrap for everyone. Oh, you can *put* **Markdown** into a blockquote. 
## Inline HTML You can also use raw HTML in your Markdown, and it'll mostly work pretty well. ```no-highlight
    <dl>
      <dt>Definition list</dt>
      <dd>Is something people use sometimes.</dd>

      <dt>Markdown in HTML</dt>
      <dd>Does *not* work **very** well. Use HTML <em>tags</em>.</dd>
    </dl>
    ```
    <dl>
      <dt>Definition list</dt>
      <dd>Is something people use sometimes.</dd>

      <dt>Markdown in HTML</dt>
      <dd>Does *not* work **very** well. Use HTML <em>tags</em>.</dd>
    </dl>
    ## Horizontal Rule ``` Three or more... --- Hyphens *** Asterisks ___ Underscores ``` Three or more... --- Hyphens *** Asterisks ___ Underscores ## Line Breaks My basic recommendation for learning how line breaks work is to experiment and discover -- hit <Enter> once (i.e., insert one newline), then hit it twice (i.e., insert two newlines), see what happens. You'll soon learn to get what you want. "Markdown Toggle" is your friend. Here are some things to try out: ``` Here's a line for us to start with. This line is separated from the one above by two newlines, so it will be a *separate paragraph*. This line is also a separate paragraph, but... This line is only separated by a single newline, so it's a separate line in the *same paragraph*. ``` Here's a line for us to start with. This line is separated from the one above by two newlines, so it will be a *separate paragraph*. This line also begins a separate paragraph, but... This line is only separated by a single newline, so it's a separate line in the *same paragraph*. (Technical note: *Markdown Here* uses GFM line breaks, so there's no need to use MD's two-space line breaks.) ## YouTube Videos They can't be added directly but you can add an image with a link to the video like this: ```no-highlight <a href="http://www.youtube.com/watch?feature=player_embedded&v=YOUTUBE_VIDEO_ID_HERE " target="_blank"><img src="http://img.youtube.com/vi/YOUTUBE_VIDEO_ID_HERE/0.jpg" alt="IMAGE ALT TEXT HERE" width="240" height="180" border="10" /></a> ``` Or, in pure Markdown, but losing the image sizing and border: ```no-highlight [![IMAGE ALT TEXT HERE](http://img.youtube.com/vi/YOUTUBE_VIDEO_ID_HERE/0.jpg)](http://www.youtube.com/watch?v=YOUTUBE_VIDEO_ID_HERE) ``` Referencing a bug by #bugID in your git commit links it to the slip. For example #1. --- License: [CC-BY](https://creativecommons.org/licenses/by/3.0/)html2md-0.2.15/test-samples/md-syntax-sample.html000064400000000000000000000153201046102023000177730ustar 00000000000000

    An h1 header

    Paragraphs are separated by a blank line.

    2nd paragraph. Italic, bold, and monospace. Itemized lists
    look like:

    • this one
    • that one
    • the other one

    Note that — not considering the asterisk — the actual text
    content starts at 4-columns in.

    Block quotes are
    written like so.

    They can span multiple paragraphs,
    if you like.

    Use 3 dashes for an em-dash. Use 2 dashes for ranges (ex., “it’s all
    in chapters 12–14”). Three dots … will be converted to an ellipsis.
    Unicode is supported. ☺

    An h2 header

    Here’s a numbered list:

    1. first item
    2. second item
    3. third item

    Note again how the actual text starts at 4 columns in (4 characters
    from the left side). Here’s a code sample:

    # Let me re-iterate ...
    for i in 1 .. 10 { do-something(i) }
    

    As you probably guessed, indented 4 spaces. By the way, instead of
    indenting the block, you can use delimited blocks, if you like:

    define foobar() {
        print "Welcome to flavor country!";
    }
    

    (which makes copying & pasting easier). You can optionally mark the
    delimited block for Pandoc to syntax highlight it:

    import time
    # Quick, count to ten!
    for i in range(10):
        # (but not *too* quick)
        time.sleep(0.5)
        print i
    

    An h3 header

    Now a nested list:

    1. First, get these ingredients:

      • carrots
      • celery
      • lentils
    2. Boil some water.

    3. Dump everything in the pot and follow
      this algorithm:

      find wooden spoon
      uncover pot
      stir
      cover pot
      balance wooden spoon precariously on pot handle
      wait 10 minutes
      goto first step (or shut off burner when done)
      

      Do not bump wooden spoon or it will fall.

    Notice again how text always lines up on 4-space indents (including
    that last line which continues item 3 above).

    Here’s a link to a website, to a local doc, and to a section heading
    in the current doc. Here’s a footnote 1.

    Tables can look like this:

    size material color


    9 leather brown
    10 hemp canvas natural
    11 glass transparent

    Table: Shoes, their sizes, and what they’re made of

    (The above is the caption for the table.) Pandoc also supports
    multi-line tables:


    keyword text


    red Sunsets, apples, and
    other red or reddish
    things.

    green Leaves, grass, frogs
    and other things it’s
    not easy being.


    A horizontal rule follows.


    Here’s a definition list:

    apples
    Good for making applesauce.
    oranges
    Citrus!
    tomatoes
    There’s no “e” in tomatoe.

    Again, text is indented 4 spaces. (Put a blank line between each
    term/definition pair to spread things out more.)

    Here’s a “line block”:

    | Line one
    | Line too
    | Line tree

    and images can be specified like so:

    example image

    Inline math equations go in like so: ω=dϕ/dt\omega = d\phi / dt. Display
    math should get its own line and be put in in double-dollarsigns:

    I=ρR2dVI = \int \rho R^{2} dV

    And note that you can backslash-escape any punctuation characters
    which you wish to be displayed literally, ex.: `foo`, *bar*, etc.


    1. Footnote text goes here. ↩︎

html2md-0.2.15/test-samples/md-syntax-sample.md000064400000000000000000000065061046102023000174350ustar 00000000000000
An h1 header
============

Paragraphs are separated by a blank line.

2nd paragraph. *Italic*, **bold**, and `monospace`. Itemized lists
look like:

  * this one
  * that one
  * the other one

Note that --- not considering the asterisk --- the actual text
content starts at 4-columns in.

> Block quotes are
> written like so.
>
> They can span multiple paragraphs,
> if you like.

Use 3 dashes for an em-dash. Use 2 dashes for ranges (ex., "it's all
in chapters 12--14"). Three dots ... will be converted to an ellipsis.
Unicode is supported. ☺

An h2 header
------------

Here's a numbered list:

 1. first item
 2. second item
 3. third item

Note again how the actual text starts at 4 columns in (4 characters
from the left side). Here's a code sample:

    # Let me re-iterate ...
    for i in 1 .. 10 { do-something(i) }

As you probably guessed, indented 4 spaces. By the way, instead of
indenting the block, you can use delimited blocks, if you like:

~~~
define foobar() {
    print "Welcome to flavor country!";
}
~~~

(which makes copying & pasting easier). You can optionally mark the
delimited block for Pandoc to syntax highlight it:

~~~python
import time
# Quick, count to ten!
for i in range(10):
    # (but not *too* quick)
    time.sleep(0.5)
    print i
~~~

### An h3 header ###

Now a nested list:

 1. First, get these ingredients:

      * carrots
      * celery
      * lentils

 2. Boil some water.

 3. Dump everything in the pot and follow
    this algorithm:

        find wooden spoon
        uncover pot
        stir
        cover pot
        balance wooden spoon precariously on pot handle
        wait 10 minutes
        goto first step (or shut off burner when done)

    Do not bump wooden spoon or it will fall.

Notice again how text always lines up on 4-space indents (including
that last line which continues item 3 above).

Here's a link to [a website](http://foo.bar), to a [local
doc](local-doc.html), and to a [section heading in the current
doc](#an-h2-header).
Here's a footnote [^1].

[^1]: Footnote text goes here.

Tables can look like this:

size  material      color
----  ------------  ------------
9     leather       brown
10    hemp canvas   natural
11    glass         transparent

Table: Shoes, their sizes, and what they're made of

(The above is the caption for the table.) Pandoc also supports
multi-line tables:

--------  -----------------------
keyword   text
--------  -----------------------
red       Sunsets, apples, and
          other red or reddish
          things.

green     Leaves, grass, frogs
          and other things it's
          not easy being.
--------  -----------------------

A horizontal rule follows.

***

Here's a definition list:

apples
  : Good for making applesauce.
oranges
  : Citrus!
tomatoes
  : There's no "e" in tomatoe.

Again, text is indented 4 spaces. (Put a blank line between each
term/definition pair to spread things out more.)

Here's a "line block":

| Line one
|   Line too
| Line tree

and images can be specified like so:

![example image](example-image.jpg "An exemplary image")

Inline math equations go in like so: $\omega = d\phi / dt$.
Display math should get its own line and be put in in double-dollarsigns:

$$I = \int \rho R^{2} dV$$

And note that you can backslash-escape any punctuation characters
which you wish to be displayed literally, ex.: \`foo\`, \*bar\*, etc.
html2md-0.2.15/tests/iframes.rs000064400000000000000000000021071046102023000144160ustar 00000000000000
extern crate html2md;

use html2md::parse_html;
use pretty_assertions::assert_eq;

#[test]
fn test_youtube_simple() {
    let md = parse_html("");
    assert_eq!(md, "[![Embedded YouTube video](https://img.youtube.com/vi/zE-dmXZp3nU/0.jpg)](https://www.youtube.com/watch?v=zE-dmXZp3nU)")
}

#[test]
fn test_instagram_simple() {
    let md = parse_html("");
    assert_eq!(md, "[![Embedded Instagram post](https://www.instagram.com/p/B1BKr9Wo8YX/media/?size=m)](https://www.instagram.com/p/B1BKr9Wo8YX/embed/)")
}

#[test]
fn test_vkontakte_simple() {
    let md = parse_html("");
    assert_eq!(md, "[![Embedded VK video](https://st.vk.com/images/icons/video_empty_2x.png)](https://vk.com/video-76477496_456239454)")
}
html2md-0.2.15/tests/images.rs000064400000000000000000000055261046102023000142450ustar 00000000000000
extern crate html2md;

use html2md::parse_html;
use pretty_assertions::assert_eq;

#[test]
fn test_image_native_simple() {
    let md = parse_html("\"image");
    assert_eq!(md, "![image of Linus holding his laptop](https://i.redd.it/vesfbmwfkz811.png \"Daddy Linus\")")
}

#[test]
fn test_image_native_without_title() {
    let md = parse_html("\"image");
    assert_eq!(md, "![image of usual kill -9 sequence](https://i.redd.it/l0ne52x7fh611.png)")
}

#[test]
fn test_image_embedded_html() {
    let md = parse_html("\"comics");
    assert_eq!(md, "\"comics")
}

#[test]
fn test_image_embedded_with_unsupported_html() {
    // srcset is unsupported in Markdown
    let md = parse_html("\"HACKERMAN\"");
    assert_eq!(md, "\"HACKERMAN\"")
}

#[test]
fn test_image_src_issue() {
    let md = parse_html("");
    assert_eq!(md, "")
}

#[test]
fn test_image_with_space_issue() {
    let md = parse_html("\"image");
    assert_eq!(md, "![image of usual kill -9 sequence](https://i.redd.it/l0ne%2052x7f%20h611.png)")
}

#[test]
fn test_image_with_query_issue() {
    let md = parse_html("");
    assert_eq!(md, "![](https://instagram.ftll1-1.fna.fbcdn.net/vp/4c753762a3cd58ec2cd55f7e20f87e5c/5D39A8B3/t51.2885-15/sh0.08/e35/p640x640/54511922_267736260775264_8482507773977053160_n.jpg?_nc_ht=instagram.ftll1-1.fna.fbcdn.net)")
}

#[test]
fn test_image_with_unsupported_html_and_quotes_in_alt() {
    let md = parse_html(r#"A "pipe""#);
    assert_eq!(md, r#"A "pipe""#)
}
html2md-0.2.15/tests/integration.rs000064400000000000000000000070251046102023000153170ustar 00000000000000
extern crate html2md;
extern crate spectral;

use html2md::parse_html;
use std::fs::File;
use std::io::prelude::*;
use spectral::prelude::*;
use indoc::indoc;

#[test]
#[ignore]
fn test_marcfs() {
    let mut html = String::new();
    let mut html_file = File::open("test-samples/marcfs-readme.html").unwrap();
    html_file.read_to_string(&mut html).expect("File must be readable");
    let result = parse_html(&html);
    println!("{}", result);
}

#[test]
#[ignore]
fn test_cheatsheet() {
    let mut html = String::new();
    let mut md = String::new();
    let mut html_file = File::open("test-samples/markdown-cheatsheet.html").unwrap();
    let mut md_file = File::open("test-samples/markdown-cheatsheet.md").unwrap();
    html_file.read_to_string(&mut html).expect("File must be readable");
    md_file.read_to_string(&mut md).expect("File must be readable");
    let md_parsed = parse_html(&html);
    println!("{}", md_parsed);
    //assert_eq!(md, md_parsed);
}

/// newlines after list shouldn't be converted into text of the last list element
#[test]
fn test_list_newlines() {
    let mut html = String::new();
    let mut html_file = File::open("test-samples/dybr-bug-with-list-newlines.html").unwrap();
    html_file.read_to_string(&mut html).expect("File must be readable");
    let result = parse_html(&html);
    assert_that(&result).contains(".\n\nxxx xxxx");
    assert_that(&result).contains("xx x.\n\nxxxxx:");
}

#[test]
fn test_lists_from_text() {
    let mut html = String::new();
    let mut html_file = File::open("test-samples/dybr-bug-with-lists-from-text.html").unwrap();
    html_file.read_to_string(&mut html).expect("File must be readable");
    let result = parse_html(&html);
    assert_that(&result).contains("\\- x xxxx xxxxx xx xxxxxxxxxx");
    assert_that(&result).contains("\\- x xxxx xxxxxxxx xxxxxxxxx xxxxxx xxx x xxxxxxxx xxxx");
    assert_that(&result).contains("\\- xxxx xxxxxxxx");
}

#[test]
fn test_strong_inside_link() {
    let mut html = String::new();
    let mut html_file = File::open("test-samples/dybr-bug-with-strong-inside-link.html").unwrap();
    html_file.read_to_string(&mut html).expect("File must be readable");
    let result = parse_html(&html);
    assert_that(&result).contains("[**Just God**](http://fanfics.me/ficXXXXXXX)");
}

#[test]
fn test_tables_with_newlines() {
    let mut html = String::new();
    let mut html_file = File::open("test-samples/dybr-bug-with-tables-masked.html").unwrap();
    html_file.read_to_string(&mut html).expect("File must be readable");
    let result = parse_html(&html);

    // all lines starting with | should end with | as well
    let invalid_table_lines: Vec<&str> = result.lines()
        .filter(|line| line.starts_with("|"))
        .filter(|line| !line.ends_with("|"))
        .collect();

    assert_that(&invalid_table_lines).is_empty();
}

#[test]
fn test_tables_crash2() {
    let mut html = String::new();
    let mut html_file = File::open("test-samples/dybr-bug-with-tables-2-masked.html").unwrap();
    html_file.read_to_string(&mut html).expect("File must be readable");
    let table_with_vertical_header = parse_html(&html);

    assert_that!(table_with_vertical_header).contains(indoc! {"
        |Current Conditions:|Open all year. No reservations. No services.|
        |-------------------|--------------------------------------------|
        | Reservations: | No reservations. |
        | Fees | No fee. |
        | Water: | No water. |"
    });
}
html2md-0.2.15/tests/lists.rs000064400000000000000000000132541046102023000141330ustar 00000000000000
extern crate html2md;

use html2md::parse_html;
use pretty_assertions::assert_eq;

#[test]
fn test_list_simple() {
    let md = parse_html(r#"

    • Seven things has lady Lackless
    • Keeps them underneath her black dress
    • One a thing that's not for wearing

    "#);
    assert_eq!(md, "\ * Seven things has lady Lackless * Keeps them underneath her black dress * One a thing that's not for wearing")
}

#[test]
fn test_list_formatted() {
    // let's use some broken html
    let md = parse_html(r#"

    • You should NEVER see this error
      • Broken lines, broken strings
      • Broken threads, broken springs
      • Broken idols, broken heads
      • People sleep in broken beds
    • Ain't no use jiving
    • Ain't no use joking
    • EVERYTHING IS BROKEN "#); assert_eq!(md, "\ * You should NEVER see this error * Broken lines, broken strings * Broken threads, broken springs * Broken idols, broken heads * People sleep in broken beds * Ain't no use jiving * Ain't no use joking * EVERYTHING IS BROKEN") } #[test] fn test_list_stackedit() { let md = parse_html(r#"
      • You should NEVER see this error

        • Broken lines, broken strings

        • Broken threads, broken springs

        • Broken idols, broken heads

        • People sleep in broken beds

      • Ain’t no use jiving

      • Ain’t no use joking

      • EVERYTHING IS BROKEN

      "#); assert_eq!(md, "\ * You should NEVER see this error * Broken lines, broken strings * Broken threads, broken springs * Broken idols, broken heads * People sleep in broken beds * Ain’t no use jiving * Ain’t no use joking * EVERYTHING IS BROKEN") } #[test] fn test_list_stackedit_add_brs() { let md = parse_html(r#"
      • You should NEVER see this error

        • Broken lines, broken strings

        • Broken threads, broken springs

        • Broken idols, broken heads

        • People sleep in broken beds



      • Ain’t no use jiving

      • Ain’t no use joking

      • EVERYTHING IS BROKEN

      "#); assert_eq!(md, "\ * You should NEVER see this error * Broken lines, broken strings * Broken threads, broken springs * Broken idols, broken heads * People sleep in broken beds * Ain’t no use jiving * Ain’t no use joking * EVERYTHING IS BROKEN") } #[test] fn test_list_multiline() { let md = parse_html(r#"
      1. In the heat and the rains

        With whips and chains

        Just to see him fly
        So many die!

      "#);
    assert_eq!(md, "\ 1. In the heat and the rains With whips and chains Just to see him fly So many die!")
}

#[test]
fn test_list_multiline_formatted() {
    // let's use some broken html
    let md = parse_html(r#"

      • You should NEVER see this error
        • Broken lines, broken strings
        • Broken threads, broken springs
        • Broken idols, broken heads
        • People sleep in broken beds
        • Ain't no use jiving

          Ain't no use joking

          EVERYTHING IS BROKEN

      • "#);
    assert_eq!(md, "\ * You should NEVER see this error * Broken lines, broken strings * Broken threads, broken springs * Broken idols, broken heads * People sleep in broken beds * Ain't no use jiving Ain't no use joking EVERYTHING IS BROKEN")
}

#[test]
fn test_list_ordered() {
    // let's use some broken html
    let md = parse_html(r#"
        1. Now did you read the news today?
        2. They say the danger's gone away
        3. Well I can see the fire still alight
        4. Burning into the night
        "#); assert_eq!(md, "\ 1. Now did you read the news today? 2. They say the danger's gone away 3. Well I can see the fire still alight 4. Burning into the night") } #[test] fn test_list_text_prevsibling() { let md = parse_html(r#" Phrases to describe me:
        • Awesome
        • Cool
        • Awesome and cool
        • Can count to five
        • Learning to count to six B)
        "#); assert_eq!(md, "\ Phrases to describe me: * Awesome * Cool * Awesome and cool * Can count to five * Learning to count to six B)") } html2md-0.2.15/tests/quotes.rs000064400000000000000000000024741046102023000143170ustar 00000000000000extern crate html2md; use html2md::parse_html; use pretty_assertions::assert_eq; use indoc::indoc; #[test] fn test_quotes() { let md = parse_html("

        here's a quote\n next line of it
        And some text after it

        "); assert_eq!(md, "\ > here's a quote next line of it And some text after it") } #[test] fn test_quotes2() { let md = parse_html("

        here's
        nested quote!
        a quote\n next line of it

        "); assert_eq!(md, "\ > here's > > nested quote! > > a quote next line of it") } #[test] fn test_blockquotes() { let md = parse_html("
        Quote at the start of the message
        Should not crash the parser"); assert_eq!(md, "\ > Quote at the start of the message Should not crash the parser") } #[test] fn test_details() { let html = indoc! {"
        There are more things in heaven and Earth, Horatio

        Than are dreamt of in your philosophy

        "}; let md = parse_html(&html); assert_eq!(md, "
        There are more things in heaven and Earth, **Horatio**\n\nThan are dreamt of in your philosophy\n\n
        ") } #[test] fn test_subsup() { let md = parse_html("X2"); assert_eq!(md, r#"X2"#) }html2md-0.2.15/tests/styles.rs000064400000000000000000000015211046102023000143120ustar 00000000000000extern crate html2md; use html2md::parse_html; use pretty_assertions::assert_eq; #[test] fn test_styles_with_spaces() { let md = parse_html(r#"It read: Nobody will ever love you"#); assert_eq!(md, r#"It read: ~~Nobody will ever love you~~"#) } #[test] fn test_styles_with_newlines() { let md = parse_html(r#" And she said:
        We are all just prisoners here
        Of our own device
        And in the master's chambers
        They gathered for the feast
        They stab it with their steely knives
        But they just can't kill the beast
        "#); assert_eq!(md, "\ And she said: ~~We are all just prisoners here Of our own device~~ And in the master's chambers They gathered for the feast *They stab it with their steely knives* **But they just can't kill the beast**") }html2md-0.2.15/tests/tables.rs000064400000000000000000000115621046102023000142470ustar 00000000000000extern crate html2md; use html2md::parse_html; use pretty_assertions::assert_eq; #[test] fn test_tables() { let md = parse_html(r#"
        Minor1 Minor2 Minor3 Minor4
        col1 col2 col3 col4
        "#); assert_eq!(md, "\ |Minor1|Minor2|Minor3|Minor4| |------|------|------|------| | col1 | col2 | col3 | col4 |"); } #[test] fn test_tables_invalid_more_headers() { let md = parse_html(r#"
        Minor1 Minor2 Minor3 Minor4 Minor5 Minor6
        col1 col2 col3 col4
        "#); assert_eq!(md, "\ |Minor1|Minor2|Minor3|Minor4|Minor5|Minor6| |------|------|------|------|------|------| | col1 | col2 | col3 | col4 | | |"); } #[test] fn test_tables_invalid_more_rows() { let md = parse_html(r#"
        Minor1 Minor2
        col1 col2 col3 col4
        "#); assert_eq!(md, "\ |Minor1|Minor2| | | |------|------|----|----| | col1 | col2 |col3|col4|"); } #[test] fn test_tables_odd_column_width() { let md = parse_html(r#"
        Minor Major
        col1 col2
        "#); assert_eq!(md, "\ |Minor|Major| |-----|-----| |col1 |col2 |"); } #[test] fn test_tables_alignment() { let md = parse_html(r#"
        Minor1 Minor2 Minor3 Minor4
        col1 col2 col3 col4
        "#); assert_eq!(md, "\ |Minor1|Minor2|Minor3|Minor4| |-----:|:----:|-----:|:-----| | col1 | col2 | col3 | col4 |"); } #[test] fn test_tables_wild_example() { let md = parse_html(r#"
        One ring
        Patterns
        Titanic



        One ring to rule them all
        There's one for the sorrow
        Roll on, Titanic, roll



        One ring to find them
        And two for the joy
        You're the pride of White Star Line



        One ring to bring them all
        And three for the girls
        Roll on, Titanic, roll



        And in the darkness bind them
        And four for the boys
        Into the mists of time



        "#); assert_eq!(md, "\ | One ring | Patterns | Titanic | | | | |-----------------------------|--------------------------|-----------------------------------|---|---|---| | One ring to rule them all |There's one for the sorrow| Roll on, Titanic, roll | | | | | One ring to find them | And two for the joy |You're the pride of White Star Line| | | | | One ring to bring them all | And three for the girls | Roll on, Titanic, roll | | | | |And in the darkness bind them| And four for the boys | Into the mists of time | | | |"); }html2md-0.2.15/tests/unit.rs000064400000000000000000000075401046102023000137550ustar 00000000000000extern crate html2md; use html2md::parse_html; use pretty_assertions::assert_eq; #[test] fn test_dumb() { let md = parse_html("

        CARTHAPHILUS

        "); assert_eq!(md, "CARTHAPHILUS") } #[test] fn test_anchor() { let md = parse_html(r#"

        APOSIMZ

        "#); assert_eq!(md, "[APOSIMZ](http://ya.ru)") } #[test] fn test_anchor2() { let md = parse_html(r#"

        APOSIMZSIDONIA

        "#); assert_eq!(md, "[APOSIMZ](http://ya.ru)[SIDONIA](http://yandex.ru)") } #[test] fn test_anchor3() { let md = parse_html(r#"

        APOSIMZ

        SIDONIA

        "#);
    assert_eq!(md, "\ [APOSIMZ](http://ya.ru) [SIDONIA](http://yandex.ru)")
}

#[test]
fn test_anchor_with_name_attribute_is_preserved() {
    let md = parse_html(r#"

        "#); assert_eq!(md, r#""#) } #[test] fn test_image() { let md = parse_html(r#"

        Gitter
        "#); assert_eq!(md, "[![Gitter](https://img.shields.io/gitter/room/MARC-FS/MARC-FS.svg)](https://gitter.im/MARC-FS/Lobby?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)") } #[test] fn test_escaping() { let md = parse_html(r#"

        *god*'s in his **heaven** - all is right with the __world__

        "#); assert_eq!(md, "\\*god\\*\'s in his \\*\\*heaven\\*\\* - all is right with the \\_\\_world\\_\\_") } #[test] fn test_escaping_mid_hyphens() { let md = parse_html(r#"

        This is a header with-hyphen!

        "#); assert_eq!(md, "This is a header with-hyphen!\n==========") } #[test] fn test_escaping_start_hyphens() { let md = parse_html(r#"

        - This is a header with starting hyphen!

        "#); assert_eq!(md, "\\- This is a header with starting hyphen!\n==========") } #[test] fn test_escaping_start_sharp() { let md = parse_html("# nothing to worry about"); assert_eq!(md, "\\# nothing to worry about") } /// Note: Also strips multiple spaces #[test] fn test_escaping_start_hyphens_space() { let md = parse_html(r#"

        - This is a header with starting hyphen!

        "#); assert_eq!(md, " \\- This is a header with starting hyphen!\n==========") } #[test] fn test_escaping_html_tags() { let md = parse_html(r#"xxxxxxx xx xxxxxxxxxxx: <iframe src="xxxxxx_xx_xxxxxxxxxxx/embed/" allowfullscreen="" height="725" width="450"></iframe>"#); assert_eq!(md, r#"xxxxxxx xx xxxxxxxxxxx: \