base64-1.0/0000755000000000000000000000000007346545000010627 5ustar0000000000000000base64-1.0/CHANGELOG.md0000644000000000000000000001163207346545000012443 0ustar0000000000000000# Revision history for base64 ## 1.0.0.0 New epoch time! lots of changes to the API. Please see MIGRATION.md for more info: * Add support for GHC 9.x * Bumped bounds to more modern dependency set * Introduced `Base64` newtype, located in `Data.Base64.Types`, as well as ADT for associated alphabets. * Signature for `encodeBase64` has changed so that it produces a `Base64` wrapped value. * `decodeBase64` and its URLsafe variants now consume a value of type `Base64` and produces a decoded value of the underlying type. * The *old* `decodeBase64` and its URLsafe variants is now renamed to `decodeBase64Untyped*`. * Fix for 32-bit systems in which memory barriers were not respected in the encoe loop (see: #56) * Optimize encode and decode loops to achieve *significant speedup* across the board. Thanks to everyone who contributed (especially @sofia-m-a!). ## 0.4.2.4 * Support for GHC 9.2.x * Remove dependency on `ghc-byteorder` * Bump base lower bound to GHC 8.10.x to make sure endianness is properly support (see: https://gitlab.haskell.org/ghc/ghc/-/issues/20338) * Bump upper bound for `deepseq` ## 0.4.2.3 * Tighter length calculations in unpadded base64url [#35](https://github.com/emilypi/base64/pull/35) ## 0.4.2.2 * Add `NFData`, `Exception`, and `Generic` instances for `Base64Error` + `@since` annotations for new instances. ([#28](https://github.com/emilypi/base64/pull/28)) * Doc improvements and add `-XTrustworty` and `-XSafe` annotations where needed. ([#27](https://github.com/emilypi/base64/pull/27)) * Improve URL canonicity validation and correctness checking (now supports correct checking for unpadded Base64url) ([#26](https://github.com/emilypi/base64/pull/26)) * Fixed perf regressions in decode * Test coverage is at 98% ## 0.4.2.1 * Security fix: reject non-canonical base64 encoded values - ([#25](https://github.com/emilypi/base64/pull/25)) * Perf improvements ## 0.4.2 * Added support for `Data.ByteString.Short`, `Data.ByteString.Lazy`, `Data.Text.Short`, and `Data.Text.Lazy`. ([#17](https://github.com/emilypi/base64/pull/17)) * Optimize decode algorithm (now beats `base64-bytestring` in every category!) - ([#13](https://github.com/emilypi/base64/pull/13)) * Use `decodeLatin1` when decoding to text, so that functions are total - ([#13](https://github.com/emilypi/base64/pull/13)) * Added `decodeWith*` variants and a `Base64Error` type to handle decoding errors when decoding base64 values - ([#13](https://github.com/emilypi/base64/pull/13)) * Improved error reporting: all offsets are now precisely accurate. - ([#13](https://github.com/emilypi/base64/pull/13)) * Validations added to head, rejecting invalid corner cases (such as bytestrings of length `l == 1 mod 4`, which are never correct) - ([#16](https://github.com/emilypi/base64/pull/16)) * Added `decodeBase64Padded` for symmetry - ([#13](https://github.com/emilypi/base64/pull/13)) ## 0.4.1 -- 2020-02-04 * Optimize loops for 32-bit and 64-bit architectures * Restructure project to be more amenable to swapping head/tail/loops * Fix module header alignment ## 0.4.0 -- 2020-01-26 * With this major version release, we remove the redundant `encodeBase64Unpadded` and `decodeBase64Unpadded` functions from `Base64.hs`. This is for two reasons: 1. There is no reason for them to exist, since all std base64 is expected to be padded (in contrast to base64url) 2. it was literally redundant with `decodeBase64`. * Use a specialized `Bool` type to give better visual cues regarding which functions add padding ## 0.3.1.1 -- 2020-01-15 * Make sure benchmark code builds ## 0.3.1.0 -- 2020-01-08 * Bug fix for `isBase64` and `isBase64Url` - wrong alphabet was used * Added `isValidBase64` and `isValidBase64Url` for alphabet conformity. The `isBase64*` functions now tell if it's *correct* base64 now in the sense that it's decodable and valid. * Dropped Cabal version to 2.0 for backcompat with Stack * Better documentation ## 0.3.0.0 -- 2020-01-07 * After a discussion with lexilambda, we're making 'encodeBase64' be `ByteString -> Text` by default, offering `ByteString -> ByteString` as a secondary format. * Add `decodeBase64Lenient` to the API for phadej * Fix unpadded decoding bug where garbage was appended to the end of garbage inputs. A cleaner way to do this is to simply encode as Base64 with padding and then strip padding chars until I come up with a workflow specific to unpadded inputs (I used to have this, so I'll have to dig it up) * Added `isBase64` and `isBase64Url` to the API * Performance is stable ## 0.2.0.0 -- 2020-01-05 * After a discussion with phadej, we're doing away with the flags, and splitting the optics out into their own separate library * Removed unnecessary inline pragmas ## 0.1.0.0 -- 2020-01-05 * Do away with the typeclasses, and just provide prisms + synonyms * Continued performance improvements to decoding * Corrected benchmarks ## 0.0.1.0 -- 2020-01-03 * First version. Released on an unsuspecting world. * Preliminary release base64-1.0/LICENSE0000644000000000000000000000276607346545000011647 0ustar0000000000000000Copyright (c) 2019, Emily Pillmore All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Emily Pillmore nor the names of other contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. base64-1.0/MIGRATION-1.0.md0000644000000000000000000004545007346545000013006 0ustar0000000000000000Migration Guide for 1.0 ---- Between the last major version (0.4.2.4) and the current major epoch (1.0), many API-related constructs have changed, and I'd like to justify them here and now so that users may have an immortalized explanation for what is most likely a disruptive change to their code. ## A faster loop First, I'd like to say that I don't *like* breaking people's code. As an author and maintainer, I try and make sure that any API breakages are justified either by a significant UX improvement, or by a measurable performance increase large enough to warrant such a breakage. As such, I believe both of these criteria are met by the 0.4.x -> 1.0 upgrade: not only is the API safer to use, but the use of type data to establish the provenance of values encoded by this library also allows the performance-sensitive loops to be much cleaner, eschewing error checking where type data suffices. To prove this point, I've benchmarked the library between these last two epochs. The benchmarks say it all (all benchmarks are done on a Thinkpad P15 Gen 2 Intel i9-11950H, 64GB DDR4, Ubuntu 22.04 with GHC 9.6.3 stock, -O2): In `base64-0.4.2.4`: ``` benchmarking encode/25/base64-bytestring time 49.97 ns (49.87 ns .. 50.07 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 49.96 ns (49.86 ns .. 50.14 ns) std dev 440.1 ps (235.9 ps .. 806.9 ps) benchmarking encode/25/base64 time 34.07 ns (33.62 ns .. 34.56 ns) 0.999 R² (0.999 R² .. 1.000 R²) mean 33.88 ns (33.77 ns .. 34.08 ns) std dev 504.2 ps (268.7 ps .. 773.8 ps) variance introduced by outliers: 18% (moderately inflated) benchmarking encode/100/base64-bytestring time 111.4 ns (110.6 ns .. 112.4 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 111.7 ns (111.3 ns .. 112.3 ns) std dev 1.787 ns (1.421 ns .. 2.247 ns) variance introduced by outliers: 19% (moderately inflated) benchmarking encode/100/base64 time 53.39 ns (53.19 ns .. 53.72 ns) 1.000 R² (0.999 R² .. 1.000 R²) mean 54.17 ns (53.60 ns .. 56.40 ns) std dev 3.163 ns (1.151 ns .. 6.269 ns) variance introduced by outliers: 78% (severely inflated) benchmarking encode/1k/base64-bytestring time 754.3 ns (750.8 ns .. 759.1 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 766.1 ns (761.4 ns .. 771.5 ns) std dev 17.44 ns (14.17 ns .. 21.34 ns) variance introduced by outliers: 29% (moderately inflated) benchmarking encode/1k/base64 time 274.6 ns (273.2 ns .. 275.9 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 276.5 ns (275.4 ns .. 277.6 ns) std dev 3.863 ns (3.413 ns .. 4.464 ns) variance introduced by outliers: 14% (moderately inflated) benchmarking encode/10k/base64-bytestring time 7.069 μs (7.054 μs .. 7.094 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 7.100 μs (7.088 μs .. 7.114 μs) std dev 44.37 ns (37.56 ns .. 54.14 ns) benchmarking encode/10k/base64 time 2.384 μs (2.364 μs .. 2.415 μs) 0.999 R² (0.998 R² .. 1.000 R²) mean 2.370 μs (2.363 μs .. 2.395 μs) std dev 42.25 ns (12.58 ns .. 86.70 ns) variance introduced by outliers: 18% (moderately inflated) benchmarking encode/100k/base64-bytestring time 70.59 μs (70.26 μs .. 70.84 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 70.11 μs (69.95 μs .. 70.28 μs) std dev 587.0 ns (508.0 ns .. 684.0 ns) benchmarking encode/100k/base64 time 23.31 μs (23.22 μs .. 23.42 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 23.59 μs (23.49 μs .. 23.72 μs) std dev 415.2 ns (343.8 ns .. 509.2 ns) variance introduced by outliers: 14% (moderately inflated) benchmarking encode/1mm/base64-bytestring time 703.6 μs (700.6 μs .. 708.7 μs) 0.999 R² (0.997 R² .. 1.000 R²) mean 703.1 μs (699.8 μs .. 720.0 μs) std dev 18.82 μs (5.505 μs .. 43.88 μs) variance introduced by outliers: 17% (moderately inflated) benchmarking encode/1mm/base64 time 238.4 μs (235.5 μs .. 241.4 μs) 0.999 R² (0.998 R² .. 1.000 R²) mean 234.6 μs (233.4 μs .. 236.5 μs) std dev 4.771 μs (3.256 μs .. 7.810 μs) variance introduced by outliers: 13% (moderately inflated) benchmarking decode/25/base64-bytestring time 54.36 ns (54.18 ns .. 54.55 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 55.11 ns (54.74 ns .. 55.63 ns) std dev 1.441 ns (1.090 ns .. 2.068 ns) variance introduced by outliers: 41% (moderately inflated) benchmarking decode/25/base64 time 53.04 ns (52.57 ns .. 53.80 ns) 0.999 R² (0.999 R² .. 1.000 R²) mean 53.42 ns (53.07 ns .. 53.93 ns) std dev 1.378 ns (1.061 ns .. 1.774 ns) variance introduced by outliers: 40% (moderately inflated) benchmarking decode/100/base64-bytestring time 145.2 ns (143.8 ns .. 146.9 ns) 0.999 R² (0.999 R² .. 1.000 R²) mean 145.3 ns (144.6 ns .. 146.5 ns) std dev 3.165 ns (2.441 ns .. 4.254 ns) variance introduced by outliers: 30% (moderately inflated) benchmarking decode/100/base64 time 140.6 ns (140.0 ns .. 141.2 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 140.6 ns (140.2 ns .. 141.4 ns) std dev 1.984 ns (1.243 ns .. 3.410 ns) variance introduced by outliers: 16% (moderately inflated) benchmarking decode/1k/base64-bytestring time 1.115 μs (1.112 μs .. 1.118 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 1.120 μs (1.118 μs .. 1.123 μs) std dev 8.290 ns (6.907 ns .. 10.42 ns) benchmarking decode/1k/base64 time 1.109 μs (1.102 μs .. 1.119 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 1.104 μs (1.102 μs .. 1.108 μs) std dev 9.031 ns (4.358 ns .. 17.14 ns) benchmarking decode/10k/base64-bytestring time 10.86 μs (10.84 μs .. 10.89 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 10.90 μs (10.88 μs .. 10.93 μs) std dev 93.78 ns (71.73 ns .. 143.6 ns) benchmarking decode/10k/base64 time 10.68 μs (10.65 μs .. 10.72 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 10.68 μs (10.66 μs .. 10.70 μs) std dev 51.31 ns (36.41 ns .. 70.46 ns) benchmarking decode/100k/base64-bytestring time 108.4 μs (108.0 μs .. 108.8 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 108.1 μs (108.0 μs .. 108.4 μs) std dev 643.5 ns (450.9 ns .. 1.043 μs) benchmarking decode/100k/base64 time 106.0 μs (105.9 μs .. 106.2 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 106.1 μs (106.0 μs .. 106.3 μs) std dev 586.1 ns (405.8 ns .. 932.3 ns) benchmarking decode/1mm/base64-bytestring time 1.076 ms (1.074 ms .. 1.079 ms) 1.000 R² (1.000 R² .. 1.000 R²) mean 1.080 ms (1.078 ms .. 1.082 ms) std dev 6.833 μs (5.938 μs .. 7.717 μs) benchmarking decode/1mm/base64 time 1.054 ms (1.050 ms .. 1.056 ms) 1.000 R² (1.000 R² .. 1.000 R²) mean 1.051 ms (1.049 ms .. 1.052 ms) std dev 4.359 μs (3.498 μs .. 5.253 μs) ``` vs in `base64-1.0.0.0`: ``` benchmarking encode/25/base64-bytestring time 52.04 ns (51.77 ns .. 52.43 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 52.23 ns (52.02 ns .. 52.50 ns) std dev 790.3 ps (649.7 ps .. 981.1 ps) variance introduced by outliers: 19% (moderately inflated) benchmarking encode/25/base64 time 35.88 ns (35.50 ns .. 36.15 ns) 1.000 R² (0.999 R² .. 1.000 R²) mean 35.44 ns (35.28 ns .. 35.61 ns) std dev 609.5 ps (466.8 ps .. 835.9 ps) variance introduced by outliers: 23% (moderately inflated) benchmarking encode/100/base64-bytestring time 116.5 ns (115.6 ns .. 117.5 ns) 0.999 R² (0.999 R² .. 0.999 R²) mean 119.1 ns (117.9 ns .. 120.9 ns) std dev 4.946 ns (3.674 ns .. 6.871 ns) variance introduced by outliers: 62% (severely inflated) benchmarking encode/100/base64 time 54.59 ns (54.15 ns .. 54.97 ns) 1.000 R² (0.999 R² .. 1.000 R²) mean 54.84 ns (54.53 ns .. 55.11 ns) std dev 967.4 ps (759.0 ps .. 1.233 ns) variance introduced by outliers: 24% (moderately inflated) benchmarking encode/1k/base64-bytestring time 792.6 ns (789.2 ns .. 796.2 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 797.4 ns (794.2 ns .. 801.2 ns) std dev 12.70 ns (10.10 ns .. 16.66 ns) variance introduced by outliers: 17% (moderately inflated) benchmarking encode/1k/base64 time 300.4 ns (296.8 ns .. 304.4 ns) 0.998 R² (0.996 R² .. 1.000 R²) mean 294.8 ns (291.3 ns .. 301.3 ns) std dev 14.55 ns (9.522 ns .. 25.93 ns) variance introduced by outliers: 68% (severely inflated) benchmarking encode/10k/base64-bytestring time 7.852 μs (7.806 μs .. 7.917 μs) 0.999 R² (0.999 R² .. 1.000 R²) mean 7.849 μs (7.810 μs .. 7.923 μs) std dev 172.2 ns (88.74 ns .. 277.8 ns) variance introduced by outliers: 23% (moderately inflated) benchmarking encode/10k/base64 time 2.748 μs (2.724 μs .. 2.773 μs) 0.999 R² (0.998 R² .. 1.000 R²) mean 2.737 μs (2.717 μs .. 2.802 μs) std dev 108.8 ns (48.74 ns .. 219.4 ns) variance introduced by outliers: 53% (severely inflated) benchmarking encode/100k/base64-bytestring time 81.01 μs (80.45 μs .. 81.98 μs) 0.999 R² (0.996 R² .. 1.000 R²) mean 80.95 μs (80.48 μs .. 82.34 μs) std dev 2.561 μs (1.019 μs .. 4.866 μs) variance introduced by outliers: 31% (moderately inflated) benchmarking encode/100k/base64 time 26.20 μs (26.13 μs .. 26.29 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 26.32 μs (26.25 μs .. 26.40 μs) std dev 238.5 ns (190.6 ns .. 314.7 ns) benchmarking encode/1mm/base64-bytestring time 793.2 μs (791.7 μs .. 794.8 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 795.1 μs (794.0 μs .. 796.2 μs) std dev 3.646 μs (3.048 μs .. 4.402 μs) benchmarking encode/1mm/base64 time 266.1 μs (260.7 μs .. 273.8 μs) 0.997 R² (0.995 R² .. 1.000 R²) mean 262.2 μs (260.6 μs .. 265.8 μs) std dev 7.496 μs (1.432 μs .. 12.40 μs) variance introduced by outliers: 23% (moderately inflated) benchmarking decode/25/base64-bytestring time 59.26 ns (59.18 ns .. 59.35 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 59.31 ns (59.22 ns .. 59.41 ns) std dev 329.7 ps (270.9 ps .. 416.3 ps) benchmarking decode/25/base64-typed time 45.90 ns (45.78 ns .. 46.04 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 45.95 ns (45.88 ns .. 46.04 ns) std dev 261.9 ps (218.3 ps .. 327.6 ps) benchmarking decode/25/base64-untyped time 55.79 ns (55.63 ns .. 56.02 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 55.90 ns (55.77 ns .. 56.06 ns) std dev 470.0 ps (364.5 ps .. 692.0 ps) benchmarking decode/100/base64-bytestring time 153.8 ns (153.4 ns .. 154.2 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 153.6 ns (153.4 ns .. 153.9 ns) std dev 931.2 ps (780.6 ps .. 1.139 ns) benchmarking decode/100/base64-typed time 121.3 ns (120.6 ns .. 122.4 ns) 0.999 R² (0.997 R² .. 1.000 R²) mean 121.5 ns (120.6 ns .. 125.2 ns) std dev 4.717 ns (1.474 ns .. 10.23 ns) variance introduced by outliers: 59% (severely inflated) benchmarking decode/100/base64-untyped time 153.2 ns (152.9 ns .. 153.5 ns) 1.000 R² (1.000 R² .. 1.000 R²) mean 153.3 ns (153.1 ns .. 153.5 ns) std dev 642.7 ps (538.6 ps .. 804.8 ps) benchmarking decode/1k/base64-bytestring time 1.246 μs (1.244 μs .. 1.248 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 1.247 μs (1.245 μs .. 1.248 μs) std dev 4.807 ns (3.911 ns .. 5.909 ns) benchmarking decode/1k/base64-typed time 909.0 ns (902.6 ns .. 919.9 ns) 0.999 R² (0.998 R² .. 1.000 R²) mean 905.9 ns (902.1 ns .. 917.4 ns) std dev 19.73 ns (7.516 ns .. 39.01 ns) variance introduced by outliers: 27% (moderately inflated) benchmarking decode/1k/base64-untyped time 1.210 μs (1.192 μs .. 1.226 μs) 0.999 R² (0.999 R² .. 1.000 R²) mean 1.222 μs (1.214 μs .. 1.226 μs) std dev 19.44 ns (10.77 ns .. 29.88 ns) variance introduced by outliers: 16% (moderately inflated) benchmarking decode/10k/base64-bytestring time 11.56 μs (11.53 μs .. 11.59 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 11.49 μs (11.47 μs .. 11.52 μs) std dev 91.19 ns (77.80 ns .. 110.9 ns) benchmarking decode/10k/base64-typed time 8.140 μs (8.126 μs .. 8.157 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 8.141 μs (8.125 μs .. 8.169 μs) std dev 70.34 ns (47.20 ns .. 119.2 ns) benchmarking decode/10k/base64-untyped time 11.25 μs (11.24 μs .. 11.27 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 11.29 μs (11.27 μs .. 11.35 μs) std dev 102.4 ns (52.23 ns .. 185.3 ns) benchmarking decode/100k/base64-bytestring time 114.2 μs (113.9 μs .. 114.6 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 114.5 μs (114.3 μs .. 114.8 μs) std dev 778.0 ns (644.2 ns .. 997.4 ns) benchmarking decode/100k/base64-typed time 80.52 μs (80.37 μs .. 80.68 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 80.56 μs (80.44 μs .. 80.75 μs) std dev 478.9 ns (347.2 ns .. 750.1 ns) benchmarking decode/100k/base64-untyped time 111.0 μs (110.8 μs .. 111.2 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 111.4 μs (111.2 μs .. 111.8 μs) std dev 836.7 ns (409.8 ns .. 1.471 μs) benchmarking decode/1mm/base64-bytestring time 1.125 ms (1.122 ms .. 1.128 ms) 1.000 R² (1.000 R² .. 1.000 R²) mean 1.123 ms (1.121 ms .. 1.124 ms) std dev 5.065 μs (4.293 μs .. 6.589 μs) benchmarking decode/1mm/base64-typed time 804.8 μs (802.3 μs .. 807.7 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 802.7 μs (802.0 μs .. 803.8 μs) std dev 2.940 μs (1.813 μs .. 4.985 μs) benchmarking decode/1mm/base64-untyped time 1.108 ms (1.106 ms .. 1.110 ms) 1.000 R² (1.000 R² .. 1.000 R²) mean 1.108 ms (1.107 ms .. 1.110 ms) std dev 5.673 μs (4.383 μs .. 7.451 μs) ``` Benchmarks are included in this repo for you to reproduce these results on your own. You can see a parity in the `encode` step between the previous library iterations and the new epoch, with a *marked* improvement in decode speed (up to 25% faster on average between the old and new versions in the optimal case, and up to 40% in the suboptimal case) which justifies the performance aspect to me. Without deferring to pipelining instructions, hex encoding can only get so fast. In the future, this change also opens the library up to an optimal SIMD implementations. ## A sounder api Second, I do not believe that these changes are unsound or overburdensome to the point that a migration to the new paradigm would be untenable. While it may be inconvenient to unwrap `Base64` types, in the `encode` case (all one must do is call `extractBase64` to extract the value from its wrapper, all caveats implied), and in the case of `decode`, an untyped variant is supplied, and is semantically consistent with the old behavior (the loop is the same). Hence, a migration is fairly easy to sketch out: ``` "encodeBase64'" -> "extractBase64 . encodeBase64'" "encodeBase64" -> "extractBase64 . encodeBase64" "decodebase64" -> "decodeBase64Untyped" "decodeBase64Unpadded" -> "decodeBase64UnpaddedUntyped" "decodeBase64Padded" -> "decodeBase64PaddedUntyped" "decodeBase64W*With" -> "decodeBase64*WithUntyped" ``` And that is all. In order to make use of the new loops, one must only use one of the blessed encode functions to generate a wrapped `Base64` value, or call `assertBase64` and proceed with using `decodeBase64` as usual in order to decode. You'll note that an untyped `encodeBase64` is not supplied, and this is due to the fact that it's trivial to extract a `Base64` encoded value once you have it. However, I want to encourage people to use the new API, so I have only supplied a decode with error checking in the untyped case, because sometimes we deal with other people's data and cannot establish provenance. In the encode case, I would rather keep that provenance a part of the API, and the user may opt to strip that data upon sending to others or around their systems. It's not my problem at that point! base64-1.0/README.md0000644000000000000000000000204307346545000012105 0ustar0000000000000000# Base64 ![Build Status](https://github.com/emilypi/base64/workflows/ci/badge.svg) [![Hackage](https://img.shields.io/hackage/v/base64.svg)](https://hackage.haskell.org/package/base64) Base64 encoding and decodings. For the companion optics and pattern synonyms, see [base64-lens](https://hackage.haskell.org/package/base64-lens). ### Summary The following types are supported for both std, padded url-safe, and unpadded url-safe alphabets: - `Data.ByteString` - `Data.ByteString.Lazy` - `Data.ByteString.Short` - `Data.Text` - `Data.Text.Lazy` - `Data.Text.Short` Additionally this library has - Better performance than `base64-bytestring` for encode and decode. - Optics for handling more complex structures with Base64 representations via the `base64-lens` package - Checks for both validity and correctness of Base64 and Base64url encodings - Rejects non-canonical encodings that do not roundtrip in other base64 libraries like `ZE==`. There are no dependencies aside from those bundled with GHC, `text-short`, and the `ghc-byteorder` re-export. base64-1.0/Setup.hs0000644000000000000000000000005607346545000012264 0ustar0000000000000000import Distribution.Simple main = defaultMain base64-1.0/base64.cabal0000644000000000000000000000545507346545000012710 0ustar0000000000000000cabal-version: 2.0 name: base64 version: 1.0 synopsis: A modern Base64 library description: A performant, featureful RFC 4648 and 7049-compliant Base64 implementation homepage: https://github.com/emilypi/base64 bug-reports: https://github.com/emilypi/base64/issues license: BSD3 license-file: LICENSE author: Emily Pillmore maintainer: Emily Pillmore , Sofia-m-a copyright: (c) 2019-2023 Emily Pillmore category: Data build-type: Simple extra-doc-files: CHANGELOG.md README.md MIGRATION-1.0.md tested-with: GHC ==8.10.7 || ==9.0.2 || ==9.2.5 || ==9.4.7 || ==9.6.3 || ==9.8.1 source-repository head type: git location: https://github.com/emilypi/base64.git library exposed-modules: Data.Base64.Types Data.ByteString.Base64 Data.ByteString.Base64.URL Data.ByteString.Lazy.Base64 Data.ByteString.Lazy.Base64.URL Data.ByteString.Short.Base64 Data.ByteString.Short.Base64.URL Data.Text.Encoding.Base64 Data.Text.Encoding.Base64.Error Data.Text.Encoding.Base64.URL Data.Text.Lazy.Encoding.Base64 Data.Text.Lazy.Encoding.Base64.URL Data.Text.Short.Encoding.Base64 Data.Text.Short.Encoding.Base64.URL other-modules: Data.Base64.Types.Internal Data.ByteString.Base64.Internal Data.ByteString.Base64.Internal.Head Data.ByteString.Base64.Internal.Tables Data.ByteString.Base64.Internal.Tail Data.ByteString.Base64.Internal.Utils Data.ByteString.Base64.Internal.W16.Loop Data.ByteString.Base64.Internal.W64.Loop build-depends: base >=4.14 && <4.20 , bytestring >=0.11 && <0.13 , deepseq >=1.4.4.0 && <1.6 , text >=2.0 && <2.3 , text-short ^>=0.1 hs-source-dirs: src default-language: Haskell2010 ghc-options: -Wall test-suite base64-tests default-language: Haskell2010 type: exitcode-stdio-1.0 hs-source-dirs: test other-modules: Internal main-is: Main.hs build-depends: base >=4.14 && <4.20 , base64 , base64-bytestring , bytestring >=0.11 , QuickCheck , random-bytestring , tasty , tasty-hunit , tasty-quickcheck , text >=2.0 , text-short ghc-options: -Wall -threaded -with-rtsopts=-N benchmark bench default-language: Haskell2010 type: exitcode-stdio-1.0 hs-source-dirs: benchmarks main-is: Base64Bench.hs build-depends: base >=4.14 && <4.20 , base64 , base64-bytestring , bytestring >=0.11 , criterion , deepseq , random-bytestring , text >=2.0 ghc-options: -Wall -rtsopts base64-1.0/benchmarks/0000755000000000000000000000000007346545000012744 5ustar0000000000000000base64-1.0/benchmarks/Base64Bench.hs0000644000000000000000000000723307346545000015271 0ustar0000000000000000{-# LANGUAGE AllowAmbiguousTypes #-} {-# LANGUAGE DataKinds #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE GADTs #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE PackageImports #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE TypeFamilies #-} module Main ( main ) where import Criterion import Criterion.Main import "base64-bytestring" Data.ByteString.Base64 as Bos import "base64" Data.ByteString.Base64 as B64 import "base64" Data.Base64.Types as B64 import Data.ByteString.Random (random) main :: IO () main = defaultMain [ env bs $ \ ~(bs25,bs100,bs1k,bs10k,bs100k,bs1mm) -> bgroup "encode" [ bgroup "25" [ bench "base64-bytestring" $ whnf Bos.encode bs25 , bench "base64" $ whnf B64.encodeBase64' bs25 ] , bgroup "100" [ bench "base64-bytestring" $ whnf Bos.encode bs100 , bench "base64" $ whnf B64.encodeBase64' bs100 ] , bgroup "1k" [ bench "base64-bytestring" $ whnf Bos.encode bs1k , bench "base64" $ whnf B64.encodeBase64' bs1k ] , bgroup "10k" [ bench "base64-bytestring" $ whnf Bos.encode bs10k , bench "base64" $ whnf B64.encodeBase64' bs10k ] , bgroup "100k" [ bench "base64-bytestring" $ whnf Bos.encode bs100k , bench "base64" $ whnf B64.encodeBase64' bs100k ] , bgroup "1mm" [ bench "base64-bytestring" $ whnf Bos.encode bs1mm , bench "base64" $ whnf B64.encodeBase64' bs1mm ] ] , env bs' $ \ ~(bs25,bs100,bs1k,bs10k,bs100k,bs1mm) -> bgroup "decode" [ bgroup "25" [ bench "base64-bytestring" $ whnf Bos.decode (B64.extractBase64 bs25) , bench "base64-typed" $ whnf B64.decodeBase64 bs25 , bench "base64-untyped" $ whnf B64.decodeBase64Untyped (B64.extractBase64 bs25) ] , bgroup "100" [ bench "base64-bytestring" $ whnf Bos.decode (B64.extractBase64 bs100) , bench "base64-typed" $ whnf B64.decodeBase64 bs100 , bench "base64-untyped" $ whnf B64.decodeBase64Untyped (B64.extractBase64 bs100) ] , bgroup "1k" [ bench "base64-bytestring" $ whnf Bos.decode (B64.extractBase64 bs1k) , bench "base64-typed" $ whnf B64.decodeBase64 bs1k , bench "base64-untyped" $ whnf B64.decodeBase64Untyped (B64.extractBase64 bs1k) ] , bgroup "10k" [ bench "base64-bytestring" $ whnf Bos.decode (B64.extractBase64 bs10k) , bench "base64-typed" $ whnf B64.decodeBase64 bs10k , bench "base64-untyped" $ whnf B64.decodeBase64Untyped (B64.extractBase64 bs10k) ] , bgroup "100k" [ bench "base64-bytestring" $ whnf Bos.decode (B64.extractBase64 bs100k) , bench "base64-typed" $ whnf B64.decodeBase64 bs100k , bench "base64-untyped" $ whnf B64.decodeBase64Untyped (B64.extractBase64 bs100k) ] , bgroup "1mm" [ bench "base64-bytestring" $ whnf Bos.decode (B64.extractBase64 bs1mm) , bench "base64-typed" $ whnf B64.decodeBase64 bs1mm , bench "base64-untyped" $ whnf B64.decodeBase64Untyped (B64.extractBase64 bs1mm) ] ] ] where bs = do a <- random 25 b <- random 100 c <- random 1000 d <- random 10000 e <- random 100000 f <- random 1000000 return (a,b,c,d,e,f) bs' = do a <- B64.encodeBase64' <$> random 25 b <- B64.encodeBase64' <$> random 100 c <- B64.encodeBase64' <$> random 1000 d <- B64.encodeBase64' <$> random 10000 e <- B64.encodeBase64' <$> random 100000 f <- B64.encodeBase64' <$> random 1000000 return (a,b,c,d,e,f) base64-1.0/src/Data/Base64/0000755000000000000000000000000007346545000013313 5ustar0000000000000000base64-1.0/src/Data/Base64/Types.hs0000644000000000000000000000712007346545000014753 0ustar0000000000000000{-# language DataKinds #-} {-# language PolyKinds #-} {-# language RankNTypes #-} {-# language TypeFamilies #-} {-# language UndecidableInstances #-} -- | -- Module : Data.ByteString.Base64.Types -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore , -- sofia-m-a -- Stability : stable -- Portability : non-portable -- -- This module contains the 'Base64' type definition, 'Alphabet' -- datatype, alphabet constraints, and various quality of life -- combinators for working with 'Base64'-wrapped data. -- module Data.Base64.Types ( Alphabet(..) , type Base64 , assertBase64 , extractBase64 , coerceBase64 , type UrlAlphabet , type StdAlphabet , type NonStandardAlphabet ) where import Data.Base64.Types.Internal (Alphabet(..), Base64(..)) import Data.Coerce (coerce) import Data.Kind import GHC.TypeLits -- | Assert the provenance of a value encoded in a particular -- base64 alphabet. -- -- /Warning/: This is a blind assertion that a particular -- value is base64 encoded in some alphabet. If you are not -- sure of the provenance of the value, you may experience -- odd behavior when attempting to decode. Use at your own -- risk. If I see any issues logged on this project from -- negligent use of this or 'coerceBase64', I will -- smite you. -- assertBase64 :: forall k a. a -> Base64 k a assertBase64 = Base64 -- | Forget the provenance of a base64-encoded value. -- extractBase64 :: Base64 k a -> a extractBase64 (Base64 a) = a -- | Coerce the alphabet of a base64-encoded bytestring -- -- /Warning/: This is a blind assertion that a particular -- value is base64 encoded in some alphabet. If you are not -- sure of the provenance of the value, you may experience -- odd behavior when attempting to decode. Use at your own -- risk. If I see any issues logged on this project from -- negligent use of this or 'assertBase64', I will -- smite you. -- coerceBase64 :: forall j k a. Base64 k a -> Base64 j a coerceBase64 = coerce -- | The type family of Url-safe alphabets -- -- This type family defines the union of compatible Url-safe base64 types. -- To write a function that is parametric over such types, issue a -- constraint like `forall k. UrlAlphabet k`. -- type family UrlAlphabet k :: Constraint where UrlAlphabet 'UrlPadded = () UrlAlphabet 'UrlUnpadded = () UrlAlphabet _ = TypeError ( 'Text "Cannot prove base64 value is encoded using the url-safe \ \alphabet. Please re-encode using the url-safe encoders, or use \ \a lenient decoder for the url-safe alphabet instead." ) -- | The type family of standard alphabets -- -- This type family defines the union of compatible standard -- alphabet base64 types -- type family StdAlphabet k :: Constraint where StdAlphabet 'StdPadded = () StdAlphabet _ = TypeError ( 'Text "Cannot prove base64 value is encoded using the std \ \alphabet. Please re-encode using the std encoders, or use \ \a lenient decoder for the std alphabet instead." ) -- | The type family of non-standard alphabets -- -- Only untyped variants of encodeBase64/decodeBase64 can interact with this -- type family, in addition to assertion\/coercion\/extraction of -- these types of values. -- type family NonStandardAlphabet k :: Constraint where NonStandardAlphabet 'NonStandard = () NonStandardAlphabet _ = TypeError ( 'Text "Cannot prove base64 value is encoded using a non-standard \ \alphabet. Please re-encode and assert/coerce the alphabet, \ \and use a lenient decoder." ) base64-1.0/src/Data/Base64/Types/0000755000000000000000000000000007346545000014417 5ustar0000000000000000base64-1.0/src/Data/Base64/Types/Internal.hs0000644000000000000000000000437207346545000016535 0ustar0000000000000000{-# language DataKinds #-} {-# language RankNTypes #-} {-# language TypeFamilies #-} -- | -- Module : Data.ByteString.Base64.Types.Internal -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore , -- sofia-m-a -- Stability : stable -- Portability : non-portable -- -- This module contains the 'Base64' newtype, 'Alphabet' -- datatype, and associated instances for 'Base64'. -- module Data.Base64.Types.Internal ( Alphabet(..) , Base64(..) ) where import Control.DeepSeq (NFData, rnf) -- | The different kinds of supported Base64 encodings data Alphabet = StdPadded -- ^ Standard base64 according to [RFC 4648 §4](https://datatracker.ietf.org/doc/html/rfc4648#section-4) -- Padding is always inserted when encoding, and required when decoding | UrlPadded -- ^ Standard base64 according to [RFC 4648 §4](https://datatracker.ietf.org/doc/html/rfc4648#section-4) -- Padding is never inserted when encoding, and optional when decoding per -- [RFC 7049](https://datatracker.ietf.org/doc/html/rfc7049#section-2.4.4.2). | UrlUnpadded -- ^ URL-safe base64 according to [RFC 4648 §5](https://datatracker.ietf.org/doc/html/rfc4648#section-5) aka base64url -- Padding is never inserted when encoding, and optional when decoding | NonStandard -- ^ Any non-standard, non RFC 4648-compliant base64 encoding. -- Can only be decoded using lenient decoders. -- | Wraps a value, asserting that it is or is intended to be -- in a particular kind of Base64 encoding use 'extractBase64' -- to extract the value, and 'assertBase64' to tag a value -- as base64-encoded -- newtype Base64 (k :: Alphabet) a = Base64 a instance forall k. Functor (Base64 k) where fmap f (Base64 a) = Base64 (f a) instance forall k a. (Eq a) => Eq (Base64 k a) where Base64 a == Base64 b = a == b instance forall k. Applicative (Base64 k) where pure = Base64 Base64 f <*> Base64 a = Base64 (f a) instance forall k. Monad (Base64 k) where return = pure Base64 a >>= k = k a instance forall k a. (Show a) => Show (Base64 k a) where show (Base64 a) = show a instance forall k a. NFData a => NFData (Base64 k a) where rnf (Base64 a) = rnf a base64-1.0/src/Data/ByteString/0000755000000000000000000000000007346545000014361 5ustar0000000000000000base64-1.0/src/Data/ByteString/Base64.hs0000644000000000000000000001133607346545000015745 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE DataKinds #-} {-# LANGUAGE OverloadedStrings #-} -- | -- Module : Data.ByteString.Base64 -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.ByteString.ByteString'-valued combinators for -- implementing the RFC 4648 specification of the Base64 -- encoding format. This includes lenient decoding variants, as well as -- internal and external validation for canonicity. -- module Data.ByteString.Base64 ( -- * Encoding encodeBase64 , encodeBase64' -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64Lenient -- * Validation , isBase64 , isValidBase64 ) where import Data.Base64.Types import Data.ByteString.Internal (ByteString(..)) import Data.ByteString.Base64.Internal import Data.ByteString.Base64.Internal.Head import Data.ByteString.Base64.Internal.Tables import Data.Either (isRight) import Data.Text (Text) import qualified Data.Text.Encoding as T import System.IO.Unsafe -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'ByteString' value as Base64 'Text' with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "Sun" -- "U3Vu" -- encodeBase64 :: ByteString -> Base64 'StdPadded Text encodeBase64 = fmap T.decodeUtf8 . encodeBase64' {-# inline encodeBase64 #-} -- | Encode a 'ByteString' value as a Base64 'ByteString' value with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64' "Sun" -- "U3Vu" -- encodeBase64' :: ByteString -> Base64 'StdPadded ByteString encodeBase64' = assertBase64 . encodeBase64_ base64Table {-# inline encodeBase64' #-} -- | Decode a padded Base64-encoded 'ByteString' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'StdPadded "U3Vu" -- "Sun" -- decodeBase64 :: StdAlphabet k => Base64 k ByteString -> ByteString decodeBase64 = decodeBase64Typed_ decodeB64Table {-# inline decodeBase64 #-} -- | Decode a padded untyped Base64-encoded 'ByteString' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "U3Vu" -- Right "Sun" -- -- >>> decodeBase64Untyped "U3V" -- Left "Base64-encoded bytestring requires padding" -- -- >>> decodeBase64Untyped "U3V=" -- Left "non-canonical encoding detected at offset: 2" -- decodeBase64Untyped :: ByteString -> Either Text ByteString decodeBase64Untyped bs@(PS _ _ !l) | l == 0 = Right mempty | r == 1 = Left "Base64-encoded bytestring has invalid size" | r /= 0 = Left "Base64-encoded bytestring requires padding" | otherwise = unsafeDupablePerformIO $ decodeBase64_ decodeB64Table bs where !r = l `rem` 4 {-# inline decodeBase64Untyped #-} -- | Leniently decode an untyped Base64-encoded 'ByteString' value. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "U3Vu" -- "Sun" -- -- >>> decodeBase64Lenient "U3V" -- "Su" -- -- >>> decodeBase64Lenient "U3V=" -- "Su" -- decodeBase64Lenient :: ByteString -> ByteString decodeBase64Lenient = decodeBase64Lenient_ decodeB64Table {-# inline decodeBase64Lenient #-} -- | Tell whether a 'ByteString' value is base64 encoded. -- -- This function will also detect non-canonical encodings such as @ZE==@, which are -- externally valid Base64-encoded values, but are internally inconsistent "impossible" -- values. -- -- === __Examples__: -- -- >>> isBase64 "U3Vu" -- True -- -- >>> isBase64 "U3V" -- False -- -- >>> isBase64 "U3V=" -- False -- isBase64 :: ByteString -> Bool isBase64 bs = isValidBase64 bs && isRight (decodeBase64Untyped bs) {-# inline isBase64 #-} -- | Tell whether a 'ByteString' value is a valid Base64 format. -- -- This will not tell you whether or not this is a correct Base64 representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'ByteString' value, use 'isBase64'. -- -- === __Examples__: -- -- >>> isValidBase64 "U3Vu" -- True -- -- >>> isValidBase64 "U3V" -- True -- -- >>> isValidBase64 "U3V=" -- True -- -- >>> isValidBase64 "%" -- False -- isValidBase64 :: ByteString -> Bool isValidBase64 = validateBase64 "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/" {-# inline isValidBase64 #-} base64-1.0/src/Data/ByteString/Base64/0000755000000000000000000000000007346545000015405 5ustar0000000000000000base64-1.0/src/Data/ByteString/Base64/Internal.hs0000644000000000000000000000675507346545000017532 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE OverloadedStrings #-} -- | -- Module : Data.ByteString.Base64.Internal -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : Experimental -- Portability : portable -- -- Internal module defining the encoding and decoding -- processes and tables. -- module Data.ByteString.Base64.Internal ( validateBase64 , validateBase64Url , validateLastPad ) where import qualified Data.ByteString as BS import Data.ByteString.Internal import Data.Text (Text) import Foreign.ForeignPtr import Foreign.Ptr import Foreign.Storable import System.IO.Unsafe -- | Given a bytestring, check to see that it conforms to a given alphabet -- validateBase64 :: ByteString -> ByteString -> Bool validateBase64 !alphabet (PS !fp !off !l) = accursedUnutterablePerformIO $ withForeignPtr fp $ \p -> go (plusPtr p off) (plusPtr p (l + off)) where go !p !end | p == end = return True | otherwise = do w <- peek p let f a | a == 0x3d, plusPtr p 1 == end = True | a == 0x3d, plusPtr p 2 == end = True | a == 0x3d = False | otherwise = BS.elem a alphabet if f w then go (plusPtr p 1) end else return False {-# INLINE validateBase64 #-} validateBase64Url :: ByteString -> ByteString -> Bool validateBase64Url !alphabet bs@(PS _ _ l) | l == 0 = True | r == 0 = f bs | r == 2 = f (BS.append bs "==") | r == 3 = f (BS.append bs "=") | otherwise = False where r = l `rem` 4 f (PS fp o n) = accursedUnutterablePerformIO $ withForeignPtr fp $ \p -> go (plusPtr p o) (plusPtr p (n + o)) go !p !end | p == end = return True | otherwise = do w <- peek p let check a | a == 0x3d, plusPtr p 1 == end = True | a == 0x3d, plusPtr p 2 == end = True | a == 0x3d = False | otherwise = BS.elem a alphabet if check w then go (plusPtr p 1) end else return False {-# INLINE validateBase64Url #-} -- | This function checks that the last char of a bytestring is '=' -- and, if true, fails with a message or completes some io action. -- -- This is necessary to check when decoding permissively (i.e. filling in padding chars). -- Consider the following 4 cases of a string of length l: -- -- l = 0 mod 4: No pad chars are added, since the input is assumed to be good. -- l = 1 mod 4: Never an admissible length in base64 -- l = 2 mod 4: 2 padding chars are added. If padding chars are present in the string, they will fail as to decode as final quanta -- l = 3 mod 4: 1 padding char is added. In this case a string is of the form + . If adding the -- pad char "completes"" the string so that it is `l = 0 mod 4`, then this may possibly be forming corrupting data. -- This case is degenerate and should be disallowed. -- -- Hence, permissive decodes should only fill in padding chars when it makes sense to add them. That is, -- if an input is degenerate, it should never succeed when we add padding chars. We need the following invariant to hold: -- -- @ -- B64U.decodeUnpadded <|> B64U.decodePadded ~ B64U.decodePadded -- @ -- validateLastPad :: ByteString -> IO (Either Text ByteString) -> Either Text ByteString validateLastPad bs io | BS.last bs == 0x3d = Left "Base64-encoded bytestring has invalid padding" | otherwise = unsafeDupablePerformIO io {-# INLINE validateLastPad #-} base64-1.0/src/Data/ByteString/Base64/Internal/0000755000000000000000000000000007346545000017161 5ustar0000000000000000base64-1.0/src/Data/ByteString/Base64/Internal/Head.hs0000644000000000000000000001052107346545000020355 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE CPP #-} {-# LANGUAGE OverloadedStrings #-} -- | -- Module : Data.ByteString.Base64.Internal.Head -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : Experimental -- Portability : portable -- -- Shared internal utils -- module Data.ByteString.Base64.Internal.Head ( encodeBase64_ , encodeBase64Nopad_ , decodeBase64_ , decodeBase64Typed_ , decodeBase64Lenient_ ) where import Data.Base64.Types.Internal import Data.ByteString.Base64.Internal.Tail import Data.ByteString.Base64.Internal.Utils import Data.ByteString.Base64.Internal.W64.Loop import Data.ByteString.Internal import Data.Text (Text) import Foreign.ForeignPtr import Foreign.Ptr import GHC.ForeignPtr import GHC.Word import System.IO.Unsafe ( unsafeDupablePerformIO ) encodeBase64_ :: EncodingTable -> ByteString -> ByteString encodeBase64_ (EncodingTable !aptr !efp) (PS !sfp !soff !slen) = unsafeDupablePerformIO $ do dfp <- mallocPlainForeignPtrBytes dlen withForeignPtr dfp $ \dptr -> withForeignPtr sfp $ \sptr -> withForeignPtr efp $ \eptr -> do let !end = plusPtr sptr (soff + slen) innerLoop eptr (castPtr (plusPtr sptr soff)) (castPtr dptr) end (loopTail dfp dptr aptr (castPtr end)) where !dlen = 4 * ((slen + 2) `div` 3) encodeBase64Nopad_ :: EncodingTable -> ByteString -> ByteString encodeBase64Nopad_ (EncodingTable !aptr !efp) (PS !sfp !soff !slen) = unsafeDupablePerformIO $ do dfp <- mallocPlainForeignPtrBytes dlen withForeignPtr dfp $ \dptr -> withForeignPtr efp $ \etable -> withForeignPtr sfp $ \sptr -> do let !end = plusPtr sptr (soff + slen) innerLoop etable (castPtr (plusPtr sptr soff)) (castPtr dptr) end (loopTailNoPad dfp aptr dptr (castPtr end)) where !dlen = 4 * ((slen + 2) `div` 3) -- | The main decode function. Takes a padding flag, a decoding table, and -- the input value, producing either an error string on the left, or a -- decoded value. -- -- Note: If 'Padding' ~ 'Don\'tCare', then we pad out the input to a multiple of 4. -- If 'Padding' ~ 'Padded', then we do not, and fail if the input is not -- a multiple of 4 in length. If 'Padding' ~ 'Unpadded', then we validate -- correctness of length and the absence of padding and then treat as a std -- padded string. -- decodeBase64_ :: ForeignPtr Word8 -> ByteString -> IO (Either Text ByteString) decodeBase64_ !dtfp (PS !sfp !soff !slen) = withForeignPtr dtfp $ \dtable -> withForeignPtr sfp $ \sptr -> do dfp <- mallocPlainForeignPtrBytes dlen withForeignPtr dfp $ \dptr -> do let !end = plusPtr sptr (soff + slen) decodeLoop dtable (plusPtr sptr soff) dptr end dfp where !dlen = (slen `quot` 4) * 3 {-# inline decodeBase64_ #-} -- | The main decode function for typed base64 values. -- -- This loop is separate from 'decodeBase64_' due to the fact that -- when taking a 'Base64' value from this library, the existence -- of the wrapper is a witness to the well-formedness of the underlying value, -- and so we can eschew error checking in the decode loop. -- decodeBase64Typed_ :: ForeignPtr Word8 -> Base64 k ByteString -> ByteString decodeBase64Typed_ !dtfp (Base64 (PS sfp soff slen)) | slen == 0 = mempty | otherwise = unsafeDupablePerformIO $ withForeignPtr dtfp $ \dtable -> withForeignPtr sfp $ \sptr -> do dfp <- mallocPlainForeignPtrBytes dlen withForeignPtr dfp $ \dptr -> do let !end = plusPtr sptr (soff + slen) decodeLoopNoError dtable (plusPtr sptr soff) dptr end dfp where !dlen = (slen `quot` 4) * 3 {-# inline decodeBase64Typed_ #-} decodeBase64Lenient_ :: ForeignPtr Word8 -> ByteString -> ByteString decodeBase64Lenient_ !dtfp (PS !sfp !soff !slen) = unsafeDupablePerformIO $ withForeignPtr dtfp $ \dtable -> withForeignPtr sfp $ \sptr -> do dfp <- mallocPlainForeignPtrBytes dlen withForeignPtr dfp $ \dptr -> lenientLoop dtable (plusPtr sptr soff) dptr (plusPtr sptr (soff + slen)) dfp where !dlen = ((slen + 3) `div` 4) * 3 base64-1.0/src/Data/ByteString/Base64/Internal/Tables.hs0000644000000000000000000001002207346545000020722 0ustar0000000000000000{-# LANGUAGE MagicHash #-} {-# LANGUAGE TypeApplications #-} -- | -- Module : Data.ByteString.Base64.Internal.Tables -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : Experimental -- Portability : portable -- -- Shared internal tables -- module Data.ByteString.Base64.Internal.Tables ( base64Table , base64UrlTable , decodeB64Table , decodeB64UrlTable ) where import Data.ByteString.Base64.Internal.Utils import Foreign.ForeignPtr import GHC.Word -- | Base64url encoding table -- base64UrlTable :: EncodingTable base64UrlTable = packTable "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"# {-# NOINLINE base64UrlTable #-} -- | Base64 std encoding table -- base64Table :: EncodingTable base64Table = packTable "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"# {-# NOINLINE base64Table #-} -- | Non-URLsafe b64 decoding table (naive) -- decodeB64Table :: ForeignPtr Word8 decodeB64Table = writeNPlainForeignPtrBytes @Word8 256 [ 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0x3e,0xff,0xff,0xff,0x3f , 0x34,0x35,0x36,0x37,0x38,0x39,0x3a,0x3b,0x3c,0x3d,0xff,0xff,0xff,0x63,0xff,0xff , 0xff,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e , 0x0f,0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0xff,0xff,0xff,0xff,0xff , 0xff,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f,0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28 , 0x29,0x2a,0x2b,0x2c,0x2d,0x2e,0x2f,0x30,0x31,0x32,0x33,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff ] {-# NOINLINE decodeB64Table #-} -- | URLsafe b64 decoding table (naive) -- decodeB64UrlTable :: ForeignPtr Word8 decodeB64UrlTable = writeNPlainForeignPtrBytes @Word8 256 [ 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0x3e,0xff,0xff , 0x34,0x35,0x36,0x37,0x38,0x39,0x3a,0x3b,0x3c,0x3d,0xff,0xff,0xff,0x63,0xff,0xff , 0xff,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e , 0x0f,0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0xff,0xff,0xff,0xff,0x3f , 0xff,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f,0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28 , 0x29,0x2a,0x2b,0x2c,0x2d,0x2e,0x2f,0x30,0x31,0x32,0x33,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff , 0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff ] {-# NOINLINE decodeB64UrlTable #-} base64-1.0/src/Data/ByteString/Base64/Internal/Tail.hs0000644000000000000000000000563607346545000020420 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE TypeApplications #-} -- | -- Module : Data.ByteString.Base64.Internal.Tail -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : Experimental -- Portability : portable -- -- Finalizers for the encoding loop -- module Data.ByteString.Base64.Internal.Tail ( loopTail , loopTailNoPad ) where import Data.Bits import Data.ByteString.Internal import Data.ByteString.Base64.Internal.Utils import Foreign.ForeignPtr import Foreign.Ptr import Foreign.Storable import GHC.Exts import GHC.Word -- | Finalize an encoded bytestring by filling in the remaining -- bytes and any padding -- loopTail :: ForeignPtr Word8 -> Ptr Word8 -> Ptr Word8 -> Ptr Word8 -> Ptr Word8 -> Ptr Word8 -> IO ByteString loopTail !dfp !dptr (Ptr !alpha) !end !sptr_ !dptr_ = go sptr_ dptr_ where go src dst | src == end = pure $ PS dfp 0 (dst `minusPtr` dptr) | otherwise = do !x <- peek @Word8 src let !a = unsafeShiftR x 2 !carry0 = unsafeShiftL (x .&. 0x03) 4 -- poke first 6 bits poke @Word8 dst $ aix a alpha if src `plusPtr` 1 == end then do -- if no other bytes, poke carry bits poke @Word8 (dst `plusPtr` 1) $ aix carry0 alpha poke @Word8 (dst `plusPtr` 2) 0x3d poke @Word8 (dst `plusPtr` 3) 0x3d else do !y <- peek @Word8 $ src `plusPtr` 1 let !b = carry0 .|. unsafeShiftR (y .&. 0xf0) 4 !carry1 = unsafeShiftL (y .&. 0x0f) 2 poke @Word8 (dst `plusPtr` 1) $ aix b alpha poke @Word8 (dst `plusPtr` 2) $ aix carry1 alpha poke @Word8 (dst `plusPtr` 3) 0x3d pure $ PS dfp 0 (4 + minusPtr dst dptr) {-# inline loopTail #-} -- | Finalize a bytestring by filling out the remaining bits -- without padding. -- loopTailNoPad :: ForeignPtr Word8 -> Ptr Word8 -> Ptr Word8 -> Ptr Word8 -> Ptr Word8 -> Ptr Word8 -> IO ByteString loopTailNoPad !dfp (Ptr !alpha) !dptr !end !src !dst | src == end = return (PS dfp 0 (minusPtr dst dptr)) | plusPtr src 1 == end = do !x <- peek @Word8 src let !a = unsafeShiftR (x .&. 0xfc) 2 !b = unsafeShiftL (x .&. 0x03) 4 poke @Word8 dst (aix a alpha) poke @Word8 (plusPtr dst 1) (aix b alpha) return (PS dfp 0 (2 + minusPtr dst dptr)) | otherwise = do !x <- peek @Word8 src !y <- peek @Word8 (plusPtr src 1) let !a = unsafeShiftR (x .&. 0xfc) 2 !b = unsafeShiftL (x .&. 0x03) 4 let !c = unsafeShiftR (y .&. 0xf0) 4 .|. b !d = unsafeShiftL (y .&. 0x0f) 2 poke @Word8 dst (aix a alpha) poke @Word8 (plusPtr dst 1) (aix c alpha) poke @Word8 (plusPtr dst 2) (aix d alpha) return (PS dfp 0 (3 + minusPtr dst dptr)) {-# inline loopTailNoPad #-} base64-1.0/src/Data/ByteString/Base64/Internal/Utils.hs0000644000000000000000000000741307346545000020622 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE MagicHash #-} -- | -- Module : Data.ByteString.Base64.Internal -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : Experimental -- Portability : portable -- -- Shared internal utils -- module Data.ByteString.Base64.Internal.Utils ( EncodingTable(..) , aix , mask_2bits , mask_4bits , packTable , peekWord32BE , peekWord64BE , reChunkN , validateLastPos , w32 , w64 , w32_16 , w64_16 , writeNPlainForeignPtrBytes ) where import Data.Bits import Data.ByteString (ByteString) import qualified Data.ByteString as BS import Foreign.ForeignPtr import Foreign.Ptr import Foreign.Storable import GHC.ByteOrder import GHC.Exts import GHC.ForeignPtr import GHC.Word import System.IO.Unsafe -- | Only the lookup table need be a foreignptr, -- and then, only so that we can automate some touches to keep it alive -- data EncodingTable = EncodingTable {-# UNPACK #-} !(Ptr Word8) {-# UNPACK #-} !(ForeignPtr Word16) -- | Read 'Word8' index off alphabet addr -- aix :: Word8 -> Addr# -> Word8 aix w8 alpha = W8# (indexWord8OffAddr# alpha i) where !(I# i) = fromIntegral w8 {-# INLINE aix #-} -- | Convert 'Word8''s into 'Word32''s -- w32 :: Word8 -> Word32 w32 = fromIntegral {-# INLINE w32 #-} -- | Convert 'Word8''s into 'Word32''s -- w64 :: Word8 -> Word64 w64 = fromIntegral {-# INLINE w64 #-} -- | Convert 'Word8''s into 'Word32''s -- w64_16 :: Word16 -> Word64 w64_16 = fromIntegral {-# INLINE w64_16 #-} w32_16 :: Word16 -> Word32 w32_16 = fromIntegral {-# INLINE w32_16 #-} -- | Mask bottom 2 bits -- mask_2bits :: Word8 mask_2bits = 3 -- (1 << 2) - 1 {-# INLINE mask_2bits #-} -- | Mask bottom 4 bits -- mask_4bits :: Word8 mask_4bits = 15 -- (1 << 4) - 1 {-# INLINE mask_4bits #-} -- | Validate some ptr index against some bitmask -- validateLastPos :: Word32 -> Word8 -> Bool validateLastPos pos mask = (fromIntegral pos .&. mask) == 0 {-# INLINE validateLastPos #-} -- | Allocate and fill @n@ bytes with some data -- writeNPlainForeignPtrBytes :: ( Storable a , Storable b ) => Int -> [a] -> ForeignPtr b writeNPlainForeignPtrBytes !n as = unsafeDupablePerformIO $ do fp <- mallocPlainForeignPtrBytes n withForeignPtr fp $ \p -> go p as return (castForeignPtr fp) where go !_ [] = return () go !p (x:xs) = poke p x >> go (plusPtr p 1) xs -- | Pack an 'Addr#' into an encoding table of 'Word16's -- packTable :: Addr# -> EncodingTable packTable alphabet = etable where ix (I# n) = W8# (indexWord8OffAddr# alphabet n) !etable = let bs = concat [ [ ix i, ix j ] | !i <- [0..63] , !j <- [0..63] ] in EncodingTable (Ptr alphabet) (writeNPlainForeignPtrBytes 8192 bs) -- | Rechunk a list of bytestrings in multiples of 4 -- reChunkN :: Int -> [ByteString] -> [ByteString] reChunkN n = go where go [] = [] go (b:bs) = case divMod (BS.length b) n of (_, 0) -> b : go bs (d, _) -> case BS.splitAt (d * n) b of ~(h, t) -> h : accum t bs accum acc [] = [acc] accum acc (c:cs) = case BS.splitAt (n - BS.length acc) c of ~(h, t) -> let acc' = BS.append acc h in if BS.length acc' == n then let cs' = if BS.null t then cs else t : cs in acc' : go cs' else accum acc' cs {-# INLINE reChunkN #-} peekWord32BE :: Ptr Word32 -> IO Word32 peekWord32BE p = case targetByteOrder of LittleEndian -> byteSwap32 <$> peek p BigEndian -> peek p {-# inline peekWord32BE #-} peekWord64BE :: Ptr Word64 -> IO Word64 peekWord64BE p = case targetByteOrder of LittleEndian -> byteSwap64 <$> peek p BigEndian -> peek p {-# inline peekWord64BE #-} base64-1.0/src/Data/ByteString/Base64/Internal/W16/0000755000000000000000000000000007346545000017536 5ustar0000000000000000base64-1.0/src/Data/ByteString/Base64/Internal/W16/Loop.hs0000644000000000000000000002135007346545000021004 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE MultiWayIf #-} {-# LANGUAGE TypeApplications #-} -- | -- Module : Data.ByteString.Base64.Internal.W16.Loop -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : Experimental -- Portability : portable -- -- 'Word8' fallback loop -- module Data.ByteString.Base64.Internal.W16.Loop ( innerLoop , decodeLoop , decodeLoopNoError , lenientLoop ) where import Data.Bits import Data.ByteString.Internal import Data.ByteString.Base64.Internal.Utils import Data.Text (Text) import qualified Data.Text as T import Foreign.ForeignPtr import Foreign.Ptr import Foreign.Storable import GHC.Word -- | Encoding inner loop. Packs 3 bytes from src pointer into -- the first 6 bytes of 4 'Word8''s (using the encoding table, -- as 2 'Word12''s ), writing these to the dst pointer. -- innerLoop :: Ptr Word16 -> Ptr Word8 -> Ptr Word16 -> Ptr Word8 -> (Ptr Word8 -> Ptr Word8 -> IO ByteString) -> IO ByteString innerLoop !etable !sptr !dptr !end finish = go sptr dptr where go !src !dst | plusPtr src 2 >= end = finish src (castPtr dst) | otherwise = do !i <- w32 <$> peek src !j <- w32 <$> peek (plusPtr src 1) !k <- w32 <$> peek (plusPtr src 2) let !w = unsafeShiftL i 16 .|. unsafeShiftL j 8 .|. k !x <- peekElemOff etable (fromIntegral (unsafeShiftR w 12)) !y <- peekElemOff etable (fromIntegral (w .&. 0xfff)) poke dst x poke (plusPtr dst 2) y go (plusPtr src 3) (plusPtr dst 4) {-# inline innerLoop #-} decodeLoop :: Ptr Word8 -- ^ decode lookup table -> Ptr Word8 -- ^ src pointer -> Ptr Word8 -- ^ dst pointer -> Ptr Word8 -> ForeignPtr Word8 -> IO (Either Text ByteString) decodeLoop !dtable !sptr !dptr !end !dfp = go dptr sptr where err :: Ptr Word8 -> IO (Either Text ByteString) err p = return . Left . T.pack $ "invalid base64 encoding near offset: " ++ show (p `minusPtr` sptr) padErr :: Ptr Word8 -> IO (Either Text ByteString) padErr p = return . Left . T.pack $ "invalid padding near offset: " ++ show (p `minusPtr` sptr) canonErr :: Ptr Word8 -> IO (Either Text ByteString) canonErr p = return . Left . T.pack $ "non-canonical encoding detected at offset: " ++ show (p `minusPtr` sptr) look :: Ptr Word8 -> IO Word32 look !p = do !i <- peekByteOff @Word8 p 0 !v <- peekByteOff @Word8 dtable (fromIntegral i) return (fromIntegral v) go !dst !src | plusPtr src 4 >= end = do a <- look src b <- look (src `plusPtr` 1) c <- look (src `plusPtr` 2) d <- look (src `plusPtr` 3) finalChunk dst src a b c d | otherwise = do a <- look src b <- look (src `plusPtr` 1) c <- look (src `plusPtr` 2) d <- look (src `plusPtr` 3) decodeChunk dst src a b c d -- | Decodes chunks of 4 bytes at a time, recombining into -- 3 bytes. Note that in the inner loop stage, no padding -- characters are admissible. -- decodeChunk !dst !src a b c d | a == 0x63 = padErr src | b == 0x63 = padErr (plusPtr src 1) | c == 0x63 = padErr (plusPtr src 2) | d == 0x63 = padErr (plusPtr src 3) | a == 0xff = err src | b == 0xff = err (plusPtr src 1) | c == 0xff = err (plusPtr src 2) | d == 0xff = err (plusPtr src 3) | otherwise = do let !w = (unsafeShiftL a 18 .|. unsafeShiftL b 12 .|. unsafeShiftL c 6 .|. d) :: Word32 poke @Word8 dst (fromIntegral (unsafeShiftR w 16)) poke @Word8 (plusPtr dst 1) (fromIntegral (unsafeShiftR w 8)) poke @Word8 (plusPtr dst 2) (fromIntegral w) go (plusPtr dst 3) (plusPtr src 4) -- | Decode the final 4 bytes in the string, recombining into -- 3 bytes. Note that in this stage, we can have padding chars -- but only in the final 2 positions. -- finalChunk !dst !src a b c d | a == 0x63 = padErr src | b == 0x63 = padErr (plusPtr src 1) | c == 0x63 && d /= 0x63 = err (plusPtr src 3) -- make sure padding is coherent. | a == 0xff = err src | b == 0xff = err (plusPtr src 1) | c == 0xff = err (plusPtr src 2) | d == 0xff = err (plusPtr src 3) | otherwise = do let !w = (unsafeShiftL a 18 .|. unsafeShiftL b 12 .|. unsafeShiftL c 6 .|. d) :: Word32 poke @Word8 dst (fromIntegral (unsafeShiftR w 16)) if c == 0x63 && d == 0x63 then if validateLastPos b mask_4bits then return $ Right $ PS dfp 0 (1 + (dst `minusPtr` dptr)) else canonErr (plusPtr src 1) else if d == 0x63 then if validateLastPos c mask_2bits then do poke @Word8 (plusPtr dst 1) (fromIntegral (unsafeShiftR w 8)) return $ Right $ PS dfp 0 (2 + (dst `minusPtr` dptr)) else canonErr (plusPtr src 2) else do poke @Word8 (plusPtr dst 1) (fromIntegral (unsafeShiftR w 8)) poke @Word8 (plusPtr dst 2) (fromIntegral w) return $ Right $ PS dfp 0 (3 + (dst `minusPtr` dptr)) {-# inline decodeLoop #-} decodeLoopNoError :: Ptr Word8 -- ^ decode lookup table -> Ptr Word8 -- ^ src pointer -> Ptr Word8 -- ^ dst pointer -> Ptr Word8 -> ForeignPtr Word8 -> IO ByteString decodeLoopNoError !dtable !sptr !dptr !end !dfp = go dptr sptr where look :: Ptr Word8 -> IO Word32 look !p = do !i <- peek p !v <- peekByteOff @Word8 dtable (fromIntegral i) pure $ fromIntegral v go !dst !src | plusPtr src 4 >= end = do a <- look src b <- look (src `plusPtr` 1) c <- look (src `plusPtr` 2) d <- look (src `plusPtr` 3) let !w = (unsafeShiftL a 18 .|. unsafeShiftL b 12 .|. unsafeShiftL c 6 .|. d) :: Word32 poke @Word8 dst (fromIntegral (unsafeShiftR w 16)) if c == 0x63 && d == 0x63 then pure $ PS dfp 0 (1 + (dst `minusPtr` dptr)) else if d == 0x63 then do poke @Word8 (plusPtr dst 1) (fromIntegral (unsafeShiftR w 8)) pure $ PS dfp 0 (2 + (dst `minusPtr` dptr)) else do poke @Word8 (plusPtr dst 1) (fromIntegral (unsafeShiftR w 8)) poke @Word8 (plusPtr dst 2) (fromIntegral w) pure $ PS dfp 0 (3 + (dst `minusPtr` dptr)) | otherwise = do a <- look src b <- look (src `plusPtr` 1) c <- look (src `plusPtr` 2) d <- look (src `plusPtr` 3) let !w = (unsafeShiftL a 18 .|. unsafeShiftL b 12 .|. unsafeShiftL c 6 .|. d) :: Word32 poke @Word8 dst (fromIntegral (unsafeShiftR w 16)) poke @Word8 (plusPtr dst 1) (fromIntegral (unsafeShiftR w 8)) poke @Word8 (plusPtr dst 2) (fromIntegral w) go (plusPtr dst 3) (plusPtr src 4) {-# inline decodeLoopNoError #-} lenientLoop :: Ptr Word8 -- ^ decode lookup table -> Ptr Word8 -- ^ src pointer -> Ptr Word8 -- ^ dst pointer -> Ptr Word8 -- ^ end of src ptr -> ForeignPtr Word8 -- ^ dst foreign ptr (for consing bs) -> IO ByteString lenientLoop !dtable !sptr !dptr !end !dfp = go dptr sptr 0 where finalize !n = return $ PS dfp 0 n {-# INLINE finalize #-} look !skip !p_ f = k p_ where k !p | p >= end = f (plusPtr end (-1)) (0x63 :: Word32) | otherwise = do !i <- peekByteOff @Word8 p 0 !v <- peekByteOff @Word8 dtable (fromIntegral i) if | v == 0xff -> k (plusPtr p 1) | v == 0x63, skip -> k (plusPtr p 1) | otherwise -> f (plusPtr p 1) (fromIntegral v) go !dst !src !n | src >= end = finalize n | otherwise = look True src $ \ap a -> look True ap $ \bp b -> if | a == 0x63 -> finalize n | b == 0x63 -> finalize n | otherwise -> look False bp $ \cp c -> look False cp $ \dp d -> do let !w = unsafeShiftL a 18 .|. unsafeShiftL b 12 .|. unsafeShiftL c 6 .|. d poke @Word8 dst (fromIntegral (unsafeShiftR w 16)) if c == 0x63 then finalize (n + 1) else do poke @Word8 (plusPtr dst 1) (fromIntegral (w `unsafeShiftR` 8)) if d == 0x63 then finalize (n + 2) else do poke @Word8 (plusPtr dst 2) (fromIntegral w) go (plusPtr dst 3) dp (n + 3) base64-1.0/src/Data/ByteString/Base64/Internal/W64/0000755000000000000000000000000007346545000017541 5ustar0000000000000000base64-1.0/src/Data/ByteString/Base64/Internal/W64/Loop.hs0000644000000000000000000000565107346545000021015 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} -- | -- Module : Data.ByteString.Base64.Internal.W64.Loop -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : Experimental -- Portability : portable -- -- 'Word64'-optimized inner loops -- module Data.ByteString.Base64.Internal.W64.Loop ( innerLoop , decodeLoop , decodeLoopNoError , lenientLoop ) where import Data.Bits import Data.ByteString.Internal import Data.ByteString.Base64.Internal.Utils import qualified Data.ByteString.Base64.Internal.W16.Loop as W16 import Data.Text (Text) import Foreign.ForeignPtr import Foreign.Ptr import Foreign.Storable import GHC.Word -- | Encoding inner loop. Packs 6 bytes from src pointer into -- the first 6 bits of 4 'Word12''s (using the encoding table, -- as 2 'Word12''s ), writing these to the dst pointer. -- innerLoop :: Ptr Word16 -> Ptr Word64 -> Ptr Word64 -> Ptr Word64 -> (Ptr Word8 -> Ptr Word8 -> IO ByteString) -> IO ByteString innerLoop !etable !sptr !dptr !end finish = go sptr dptr where go !src !dst | plusPtr src 7 >= end = W16.innerLoop etable (castPtr src) (castPtr dst) (castPtr end) finish | otherwise = do !t <- peekWord64BE src let !a = unsafeShiftR t 52 .&. 0xfff !b = unsafeShiftR t 40 .&. 0xfff !c = unsafeShiftR t 28 .&. 0xfff !d = unsafeShiftR t 16 .&. 0xfff !w <- w64_16 <$> peekElemOff etable (fromIntegral a) !x <- w64_16 <$> peekElemOff etable (fromIntegral b) !y <- w64_16 <$> peekElemOff etable (fromIntegral c) !z <- w64_16 <$> peekElemOff etable (fromIntegral d) let !xx = w .|. unsafeShiftL x 16 .|. unsafeShiftL y 32 .|. unsafeShiftL z 48 poke dst (fromIntegral xx) go (plusPtr src 6) (plusPtr dst 8) {-# inline innerLoop #-} decodeLoop :: Ptr Word8 -- ^ decode lookup table -> Ptr Word8 -- ^ src pointer -> Ptr Word8 -- ^ dst pointer -> Ptr Word8 -- ^ end of src ptr -> ForeignPtr Word8 -- ^ dst foreign ptr (for consing bs) -> IO (Either Text ByteString) decodeLoop = W16.decodeLoop {-# inline decodeLoop #-} decodeLoopNoError :: Ptr Word8 -- ^ decode lookup table -> Ptr Word8 -- ^ src pointer -> Ptr Word8 -- ^ dst pointer -> Ptr Word8 -- ^ end of src ptr -> ForeignPtr Word8 -- ^ dst foreign ptr (for consing bs) -> IO ByteString decodeLoopNoError = W16.decodeLoopNoError {-# inline decodeLoopNoError #-} lenientLoop :: Ptr Word8 -- ^ decode lookup table -> Ptr Word8 -- ^ src pointer -> Ptr Word8 -- ^ dst pointer -> Ptr Word8 -- ^ end of src ptr -> ForeignPtr Word8 -- ^ dst foreign ptr (for consing bs) -> IO ByteString lenientLoop = W16.lenientLoop base64-1.0/src/Data/ByteString/Base64/URL.hs0000644000000000000000000002433107346545000016406 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE DataKinds #-} {-# LANGUAGE TypeFamilies #-} -- | -- Module : Data.ByteString.Base64.URL -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.ByteString.ByteString'-valued combinators for -- implementing the RFC 4648 specification of the url-safe Base64 (Base64url) -- encoding format. This includes strictly padded/unpadded and lenient decoding -- variants, as well as internal and external validation for canonicity. -- module Data.ByteString.Base64.URL ( -- * Encoding encodeBase64 , encodeBase64' , encodeBase64Unpadded , encodeBase64Unpadded' -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64Unpadded , decodeBase64UnpaddedUntyped , decodeBase64Padded , decodeBase64PaddedUntyped , decodeBase64Lenient -- * Validation , isBase64Url , isValidBase64Url ) where import Data.Base64.Types import Data.Base64.Types.Internal import qualified Data.ByteString as BS import Data.ByteString.Internal (ByteString(..)) import Data.ByteString.Base64.Internal import Data.ByteString.Base64.Internal.Head import Data.ByteString.Base64.Internal.Tables import Data.Either (isRight) import Data.Text (Text) import qualified Data.Text.Encoding as T import System.IO.Unsafe -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'ByteString' value as a Base64url 'Text' value with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "<>" -- "PDw_Pj4=" -- encodeBase64 :: ByteString -> Base64 'UrlPadded Text encodeBase64 = fmap T.decodeUtf8 . encodeBase64' {-# INLINE encodeBase64 #-} -- | Encode a 'ByteString' as a Base64url 'ByteString' value with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64' "<>" -- "PDw_Pj4=" -- encodeBase64' :: ByteString -> Base64 'UrlPadded ByteString encodeBase64' = assertBase64 . encodeBase64_ base64UrlTable -- | Decode a Base64url encoded 'ByteString' value, either padded or unpadded. -- The correct decoding function is dispatched based on the existence of padding. -- -- For typed values: -- - If a padded value is required, use 'decodeBase64Padded' -- - If an unpadded value is required, use 'decodeBase64Unpadded' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'UrlPadded "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64 $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64 :: UrlAlphabet k => Base64 k ByteString -> ByteString decodeBase64 b64@(Base64 bs) | not $ BS.null bs, BS.last bs == 0x3d = decodeBase64Padded $ coerceBase64 b64 | otherwise = decodeBase64Unpadded $ coerceBase64 b64 {-# inline decodeBase64 #-} -- | Decode an untyped Base64url encoded 'ByteString' value. If its length is not a multiple -- of 4, then padding chars will be added to fill out the input to a multiple of -- 4 for safe decoding as Base64url-encoded values are optionally padded. -- -- For a decoder that fails to decode untyped values of incorrect size: -- - If a padded value is required, use 'decodeBase64PaddedUntyped' -- - If an unpadded value is required, use 'decodeBase64UnpaddedUntyped' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "PDw_Pj4=" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw-Pg=" -- Left "Base64-encoded bytestring has invalid padding" -- -- >>> decodeBase64Untyped "PDw-Pg" -- Right "<<>>" -- decodeBase64Untyped :: ByteString -> Either Text ByteString decodeBase64Untyped bs@(PS _ _ !l) | l == 0 = Right mempty | r == 0 = unsafeDupablePerformIO $ decodeBase64_ decodeB64UrlTable bs | r == 2 = unsafeDupablePerformIO $ decodeBase64_ decodeB64UrlTable $ BS.append bs "==" | r == 3 = validateLastPad bs $ decodeBase64_ decodeB64UrlTable $ BS.append bs "=" | otherwise = Left "Base64-encoded bytestring has invalid size" where !r = l `rem` 4 {-# INLINE decodeBase64Untyped #-} -- | Encode a 'ByteString' value as Base64url 'Text' without padding. Note that for Base64url, -- padding is optional. If you call this function, you will simply be encoding -- as Base64url and stripping padding chars from the output. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64Unpadded "<>" -- "PDw_Pj4" -- encodeBase64Unpadded :: ByteString -> Base64 'UrlUnpadded Text encodeBase64Unpadded = fmap T.decodeUtf8 . encodeBase64Unpadded' {-# INLINE encodeBase64Unpadded #-} -- | Encode a 'ByteString' value as Base64url without padding. Note that for Base64url, -- padding is optional. If you call this function, you will simply be encoding -- as Base64url and stripping padding chars from the output. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64Unpadded' "<>" -- "PDw_Pj4" -- encodeBase64Unpadded' :: ByteString -> Base64 'UrlUnpadded ByteString encodeBase64Unpadded' = assertBase64 . encodeBase64Nopad_ base64UrlTable -- | Decode an unpadded Base64url-encoded 'ByteString' value. Input strings are -- required to be unpadded, and will undergo validation prior to decoding to -- confirm. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Unpadded $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64Unpadded :: Base64 'UrlUnpadded ByteString -> ByteString decodeBase64Unpadded b64@(Base64 (PS _ _ !l)) | r == 2 = decodeBase64Padded $ coerceBase64 $ (`BS.append` "==") <$> b64 | r == 3 = decodeBase64Padded $ coerceBase64 $ (`BS.append` "=") <$> b64 | otherwise = decodeBase64Padded $ coerceBase64 b64 where !r = l `rem` 4 -- | Decode a padded, untyped Base64url-encoded 'ByteString' value. Input strings are -- required to be correctly padded, and will be validated prior to decoding -- to confirm. -- -- In general, unless padded Base64url is explicitly required, it is -- safer to call 'decodeBase64Untyped'. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64PaddedUntyped "PDw_Pj4=" -- Right "<>" -- decodeBase64PaddedUntyped :: ByteString -> Either Text ByteString decodeBase64PaddedUntyped bs@(PS _ _ !l) | l == 0 = Right mempty | r == 1 = Left "Base64-encoded bytestring has invalid size" | r /= 0 = Left "Base64-encoded bytestring requires padding" | otherwise = unsafeDupablePerformIO $ decodeBase64_ decodeB64UrlTable bs where !r = l `rem` 4 {-# INLINE decodeBase64PaddedUntyped #-} -- | Decode a padded Base64url-encoded 'ByteString' value. Input strings are -- required to be correctly padded, and will be validated prior to decoding -- to confirm. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Padded $ assertBase64 @'UrlPadded "PDw_Pj4=" -- "<>" -- decodeBase64Padded :: Base64 'UrlPadded ByteString -> ByteString decodeBase64Padded = decodeBase64Typed_ decodeB64UrlTable {-# INLINE decodeBase64Padded #-} -- | Decode an unpadded, untyped Base64url-encoded 'ByteString' value. Input strings are -- required to be unpadded, and will undergo validation prior to decoding to -- confirm. -- -- In general, unless unpadded Base64url is explicitly required, it is -- safer to call 'decodeBase64Untyped'. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64UnpaddedUntyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64UnpaddedUntyped "PDw_Pj4=" -- Left "Base64-encoded bytestring has invalid padding" -- decodeBase64UnpaddedUntyped :: ByteString -> Either Text ByteString decodeBase64UnpaddedUntyped bs@(PS _ _ !l) | l == 0 = Right mempty | r == 0 = validateLastPad bs $ decodeBase64_ decodeB64UrlTable bs | r == 2 = validateLastPad bs $ decodeBase64_ decodeB64UrlTable $ BS.append bs "==" | r == 3 = validateLastPad bs $ decodeBase64_ decodeB64UrlTable $ BS.append bs "=" | otherwise = Left "Base64-encoded bytestring has invalid size" where !r = l `rem` 4 {-# INLINE decodeBase64UnpaddedUntyped #-} -- | Leniently decode an unpadded, untyped Base64url-encoded 'ByteString'. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64Lenient "PDw_%%%$}Pj4" -- "<>" -- decodeBase64Lenient :: ByteString -> ByteString decodeBase64Lenient = decodeBase64Lenient_ decodeB64UrlTable {-# INLINE decodeBase64Lenient #-} -- | Tell whether a 'ByteString' is encoded in padded /or/ unpadded Base64url format. -- -- This function will also detect non-canonical encodings such as @ZE==@, which are -- externally valid Base64url-encoded values, but are internally inconsistent "impossible" -- values. -- -- === __Examples__: -- -- >>> isBase64Url "PDw_Pj4=" -- True -- -- >>> isBase64Url "PDw_Pj4" -- True -- -- >>> isBase64Url "PDw_Pj" -- False -- isBase64Url :: ByteString -> Bool isBase64Url bs = isValidBase64Url bs && isRight (decodeBase64Untyped bs) {-# INLINE isBase64Url #-} -- | Tell whether a 'ByteString' is a valid Base64url format. -- -- This will not tell you whether or not this is a correct Base64url representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'ByteString' value, use 'isBase64Url'. -- -- === __Examples__: -- -- >>> isValidBase64Url "PDw_Pj4=" -- True -- -- >>> isValidBase64Url "PDw_Pj" -- True -- -- >>> isValidBase64Url "%" -- False -- isValidBase64Url :: ByteString -> Bool isValidBase64Url = validateBase64Url "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_" {-# INLINE isValidBase64Url #-} base64-1.0/src/Data/ByteString/Lazy/0000755000000000000000000000000007346545000015300 5ustar0000000000000000base64-1.0/src/Data/ByteString/Lazy/Base64.hs0000644000000000000000000001205107346545000016657 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE DataKinds #-} -- | -- Module : Data.ByteString.Lazy.Base64 -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.ByteString.Lazy.ByteString'-valued combinators for -- implementing the RFC 4648 specification of the Base64 -- encoding format. This includes lenient decoding variants, as well as -- internal and external validation for canonicity. -- module Data.ByteString.Lazy.Base64 ( -- * Encoding encodeBase64 , encodeBase64' -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64Lenient -- * Validation , isBase64 , isValidBase64 ) where import Data.Base64.Types import qualified Data.ByteString as BS import qualified Data.ByteString.Base64 as B64 import Data.ByteString.Base64.Internal.Utils (reChunkN) import Data.ByteString.Lazy (fromChunks, toChunks) import qualified Data.ByteString.Lazy as BL import Data.ByteString.Lazy.Internal (ByteString(..)) import Data.Either (isRight) import qualified Data.Text as T import qualified Data.Text.Lazy as TL import qualified Data.Text.Lazy.Encoding as TL -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'ByteString' value as Base64 'Text' with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "Sun" -- "U3Vu" -- encodeBase64 :: ByteString -> Base64 'StdPadded TL.Text encodeBase64 = fmap TL.decodeUtf8 . encodeBase64' {-# INLINE encodeBase64 #-} -- | Encode a 'ByteString' value as a Base64 'ByteString' value with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64' "Sun" -- "U3Vu" -- encodeBase64' :: ByteString -> Base64 'StdPadded ByteString encodeBase64' = assertBase64 . fromChunks . fmap (extractBase64 . B64.encodeBase64') . reChunkN 3 . toChunks {-# INLINE encodeBase64' #-} -- | Decode a padded Base64-encoded 'ByteString' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'StdPadded "U3Vu" -- "Sun" -- decodeBase64 :: StdAlphabet k => Base64 k ByteString -> ByteString decodeBase64 = fromChunks . pure . B64.decodeBase64 . fmap (BS.concat . toChunks) {-# INLINE decodeBase64 #-} -- | Decode a padded untyped Base64-encoded 'ByteString' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "U3Vu" -- Right "Sun" -- -- >>> decodeBase64Untyped "U3V" -- Left "Base64-encoded bytestring requires padding" -- -- >>> decodeBase64Untyped "U3V=" -- Left "non-canonical encoding detected at offset: 2" -- decodeBase64Untyped :: ByteString -> Either T.Text ByteString decodeBase64Untyped = fmap (fromChunks . pure) . B64.decodeBase64Untyped . BS.concat . toChunks {-# inline decodeBase64Untyped #-} -- | Leniently decode an unpadded Base64-encoded 'ByteString' value. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "U3Vu" -- "Sun" -- -- >>> decodeBase64Lenient "U3V" -- "Su" -- -- >>> decodeBase64Lenient "U3V=" -- "Su" -- decodeBase64Lenient :: ByteString -> ByteString decodeBase64Lenient = fromChunks . fmap B64.decodeBase64Lenient . reChunkN 4 . fmap (BS.filter (`BL.elem` "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=")) . toChunks {-# INLINE decodeBase64Lenient #-} -- | Tell whether a 'ByteString' value is base64 encoded. -- -- This function will also detect non-canonical encodings such as @ZE==@, which are -- externally valid Base64-encoded values, but are internally inconsistent "impossible" -- values. -- -- === __Examples__: -- -- >>> isBase64 "U3Vu" -- True -- -- >>> isBase64 "U3V" -- False -- -- >>> isBase64 "U3V=" -- False -- isBase64 :: ByteString -> Bool isBase64 bs = isValidBase64 bs && isRight (decodeBase64Untyped bs) {-# INLINE isBase64 #-} -- | Tell whether a 'ByteString' value is a valid Base64 format. -- -- This will not tell you whether or not this is a correct Base64 representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'ByteString' value, use 'isBase64'. -- -- === __Examples__: -- -- >>> isValidBase64 "U3Vu" -- True -- -- >>> isValidBase64 "U3V" -- True -- -- >>> isValidBase64 "U3V=" -- True -- -- >>> isValidBase64 "%" -- False -- isValidBase64 :: ByteString -> Bool isValidBase64 = go . toChunks where go [] = True go [c] = B64.isValidBase64 c go (c:cs) = -- note the lack of padding char BS.all (`BL.elem` "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/") c && go cs {-# INLINE isValidBase64 #-} base64-1.0/src/Data/ByteString/Lazy/Base64/0000755000000000000000000000000007346545000016324 5ustar0000000000000000base64-1.0/src/Data/ByteString/Lazy/Base64/URL.hs0000644000000000000000000002217607346545000017332 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE DataKinds #-} {-# LANGUAGE TypeFamilies #-} -- | -- Module : Data.ByteString.Lazy.Base64.URL -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.ByteString.Lazy.ByteString'-valued combinators for -- implementing the RFC 4648 specification of the Base64url -- encoding format. This includes strictly padded/unpadded and lenient -- decoding variants, as well as internal and external validation for canonicity. -- module Data.ByteString.Lazy.Base64.URL ( -- * Encoding encodeBase64 , encodeBase64' , encodeBase64Unpadded , encodeBase64Unpadded' -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64Unpadded , decodeBase64UnpaddedUntyped , decodeBase64Padded , decodeBase64PaddedUntyped , decodeBase64Lenient -- * Validation , isBase64Url , isValidBase64Url ) where import Data.Base64.Types import qualified Data.ByteString as BS import qualified Data.ByteString.Base64.URL as B64U import Data.ByteString.Base64.Internal.Utils (reChunkN) import Data.ByteString.Lazy (fromChunks, toChunks) import qualified Data.ByteString.Lazy as BL import Data.ByteString.Lazy.Internal (ByteString(..)) import Data.Either (isRight) import qualified Data.Text as T import qualified Data.Text.Lazy as TL import qualified Data.Text.Lazy.Encoding as TL -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'ByteString' value as a Base64url 'Text' value with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "<>" -- "PDw_Pj4=" -- encodeBase64 :: ByteString -> Base64 'UrlPadded TL.Text encodeBase64 = fmap TL.decodeUtf8 . encodeBase64' {-# INLINE encodeBase64 #-} -- | Encode a 'ByteString' as a Base64url 'ByteString' value with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64' "<>" -- "PDw_Pj4=" -- encodeBase64' :: ByteString -> Base64 'UrlPadded ByteString encodeBase64' = assertBase64 . fromChunks . fmap (extractBase64 . B64U.encodeBase64') . reChunkN 3 . toChunks -- | Decode a Base64url encoded 'ByteString' value, either padded or unpadded. -- The correct decoding function is dispatched based on the existence of padding. -- -- For typed values: -- - If a padded value is required, use 'decodeBase64Padded' -- - If an unpadded value is required, use 'decodeBase64Unpadded' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'UrlPadded "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64 $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- -- >>> decodeBase64 $ assertBase64 @'UrlUnpadded "PDw-Pg" -- "<<>>" -- decodeBase64 :: UrlAlphabet k => Base64 k ByteString -> ByteString decodeBase64 = fromChunks . pure . B64U.decodeBase64 . fmap (BS.concat . toChunks) {-# INLINE decodeBase64 #-} -- | Decode a padded Base64url encoded 'ByteString' value. If its length is not a multiple -- of 4, then padding chars will be added to fill out the input to a multiple of -- 4 for safe decoding as Base64url-encoded values are optionally padded. -- -- For a decoder that fails to decode untyped values of incorrect size: -- - If a padded value is required, use 'decodeBase64PaddedUntyped' -- - If an unpadded value is required, use 'decodeBase64UnpaddedUntyped' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "PDw_Pj4=" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw-Pg=" -- Left "Base64-encoded bytestring has invalid padding" -- -- >>> decodeBase64Untyped "PDw-Pg" -- Right "<<>>" -- decodeBase64Untyped :: ByteString -> Either T.Text ByteString decodeBase64Untyped = fmap (fromChunks . pure) . B64U.decodeBase64Untyped . BS.concat . toChunks {-# INLINE decodeBase64Untyped #-} -- | Encode a 'ByteString' value as Base64url 'Text' without padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64Unpadded "<>" -- "PDw_Pj4" -- encodeBase64Unpadded :: ByteString -> Base64 'UrlUnpadded TL.Text encodeBase64Unpadded = fmap TL.decodeUtf8 . encodeBase64Unpadded' {-# INLINE encodeBase64Unpadded #-} -- | Encode a 'ByteString' value as Base64url without padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64Unpadded' "<>" -- "PDw_Pj4" -- encodeBase64Unpadded' :: ByteString -> Base64 'UrlUnpadded ByteString encodeBase64Unpadded' = assertBase64 . fromChunks . fmap (extractBase64 . B64U.encodeBase64Unpadded') . reChunkN 3 . toChunks -- | Decode an unpadded Base64url-encoded 'ByteString' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Unpadded $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64Unpadded :: Base64 'UrlUnpadded ByteString -> ByteString decodeBase64Unpadded = fromChunks . pure . B64U.decodeBase64Unpadded . fmap (BS.concat . toChunks) -- | Decode an unpadded, untyped Base64url-encoded 'ByteString' value. Input strings are -- required to be unpadded, and will undergo validation prior to decoding to -- confirm. -- -- In general, unless unpadded Base64url is explicitly required, it is -- safer to call 'decodeBase64Untyped'. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64UnpaddedUntyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64UnpaddedUntyped "PDw_Pj4=" -- Left "Base64-encoded bytestring has invalid padding" -- decodeBase64UnpaddedUntyped :: ByteString -> Either T.Text ByteString decodeBase64UnpaddedUntyped = fmap (fromChunks . (:[])) . B64U.decodeBase64UnpaddedUntyped . BS.concat . toChunks {-# INLINE decodeBase64UnpaddedUntyped #-} -- | Decode a padded Base64url-encoded 'ByteString' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Unpadded $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64Padded :: Base64 'UrlPadded ByteString -> ByteString decodeBase64Padded = fromChunks . pure . B64U.decodeBase64Padded . fmap (BS.concat . toChunks) {-# inline decodeBase64Padded #-} -- | Decode a padded, untyped Base64url-encoded 'ByteString' value. Input strings are -- required to be correctly padded, and will be validated prior to decoding -- to confirm. -- -- In general, unless padded Base64url is explicitly required, it is -- safer to call 'decodeBase64'. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64PaddedUntyped "PDw_Pj4=" -- Right "<>" -- -- >>> decodeBase64PaddedUntyped "PDw_Pj4" -- Left "Base64-encoded bytestring requires padding" -- decodeBase64PaddedUntyped :: ByteString -> Either T.Text ByteString decodeBase64PaddedUntyped = fmap (fromChunks . (:[])) . B64U.decodeBase64PaddedUntyped . BS.concat . toChunks {-# INLINE decodeBase64PaddedUntyped #-} -- | Leniently decode an unpadded, untyped Base64url-encoded 'ByteString'. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64Lenient "PDw_%%%$}Pj4" -- "<>" -- decodeBase64Lenient :: ByteString -> ByteString decodeBase64Lenient = fromChunks . fmap B64U.decodeBase64Lenient . reChunkN 4 . fmap (BS.filter (`BL.elem` "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_=")) . toChunks {-# INLINE decodeBase64Lenient #-} -- | Tell whether an untyped 'ByteString' is Base64url-encoded. -- -- === __Examples__: -- -- >>> isBase64Url "PDw_Pj4=" -- True -- -- >>> isBase64Url "PDw_Pj4" -- True -- -- >>> isBase64Url "PDw_Pj" -- False -- isBase64Url :: ByteString -> Bool isBase64Url bs = isValidBase64Url bs && isRight (decodeBase64Untyped bs) {-# INLINE isBase64Url #-} -- | Tell whether an untyped 'ByteString' is a valid Base64url format. -- -- This will not tell you whether or not this is a correct Base64url representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'ByteString' value, use 'isBase64Url'. -- -- === __Examples__: -- -- >>> isValidBase64Url "PDw_Pj4=" -- True -- -- >>> isValidBase64Url "PDw_Pj" -- True -- -- >>> isValidBase64Url "%" -- False -- isValidBase64Url :: ByteString -> Bool isValidBase64Url = go . toChunks where go [] = True go [c] = B64U.isValidBase64Url c go (c:cs) = -- note the lack of padding char BS.all (`BL.elem` "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_") c && go cs {-# INLINE isValidBase64Url #-} base64-1.0/src/Data/ByteString/Short/0000755000000000000000000000000007346545000015460 5ustar0000000000000000base64-1.0/src/Data/ByteString/Short/Base64.hs0000644000000000000000000001071707346545000017046 0ustar0000000000000000{-# LANGUAGE DataKinds #-} -- | -- Module : Data.ByteString.Short.Base64 -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.ByteString.Short.ShortByteString'-valued combinators for -- implementing the RFC 4648 specification of the Base64 -- encoding format. This includes lenient decoding variants, as well as -- internal and external validation for canonicity. -- module Data.ByteString.Short.Base64 ( -- * Encoding encodeBase64 , encodeBase64' -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64Lenient -- * Validation , isBase64 , isValidBase64 ) where import Data.Base64.Types import qualified Data.ByteString.Base64 as B64 import Data.ByteString.Short (ShortByteString, fromShort, toShort) import Data.Text (Text) import Data.Text.Short (ShortText) import Data.Text.Short.Unsafe (fromShortByteStringUnsafe) -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'ShortByteString' value as Base64 'ShortText' with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "Sun" -- "U3Vu" -- encodeBase64 :: ShortByteString -> Base64 'StdPadded ShortText encodeBase64 = fmap fromShortByteStringUnsafe . encodeBase64' {-# INLINE encodeBase64 #-} -- | Encode a 'ShortByteString' value as a Base64 'ShortByteString' value with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64' "Sun" -- "U3Vu" -- encodeBase64' :: ShortByteString -> Base64 'StdPadded ShortByteString encodeBase64' = fmap toShort . B64.encodeBase64' . fromShort {-# INLINE encodeBase64' #-} -- | Decode a padded Base64-encoded 'ShortByteString' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'StdPadded "U3Vu" -- "Sun" -- decodeBase64 :: StdAlphabet k => Base64 k ShortByteString -> ShortByteString decodeBase64 = toShort . B64.decodeBase64 . fmap fromShort {-# INLINE decodeBase64 #-} -- | Decode a padded Base64-encoded 'ShortByteString' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "U3Vu" -- Right "Sun" -- -- >>> decodeBase64Untyped "U3V" -- Left "Base64-encoded bytestring requires padding" -- -- >>> decodeBase64Untyped "U3V=" -- Left "non-canonical encoding detected at offset: 2" -- decodeBase64Untyped :: ShortByteString -> Either Text ShortByteString decodeBase64Untyped = fmap toShort . B64.decodeBase64Untyped . fromShort {-# inline decodeBase64Untyped #-} -- | Leniently decode an unpadded Base64-encoded 'ShortByteString' value. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "U3Vu" -- "Sun" -- -- >>> decodeBase64Lenient "U3V" -- "Su" -- -- >>> decodeBase64Lenient "U3V=" -- "Su" -- decodeBase64Lenient :: ShortByteString -> ShortByteString decodeBase64Lenient = toShort . B64.decodeBase64Lenient . fromShort {-# INLINE decodeBase64Lenient #-} -- | Tell whether a 'ShortByteString' value is base64 encoded. -- -- This function will also detect non-canonical encodings such as @ZE==@, which are -- externally valid Base64-encoded values, but are internally inconsistent "impossible" -- values. -- -- === __Examples__: -- -- >>> isBase64 "U3Vu" -- True -- -- >>> isBase64 "U3V" -- False -- -- >>> isBase64 "U3V=" -- False -- isBase64 :: ShortByteString -> Bool isBase64 = B64.isBase64 . fromShort {-# INLINE isBase64 #-} -- | Tell whether a 'ShortByteString' value is a valid Base64 format. -- -- This will not tell you whether or not this is a correct Base64 representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'ShortByteString' value, use 'isBase64'. -- -- === __Examples__: -- -- >>> isValidBase64 "U3Vu" -- True -- -- >>> isValidBase64 "U3V" -- True -- -- >>> isValidBase64 "U3V=" -- True -- -- >>> isValidBase64 "%" -- False -- isValidBase64 :: ShortByteString -> Bool isValidBase64 = B64.isValidBase64 . fromShort {-# INLINE isValidBase64 #-} base64-1.0/src/Data/ByteString/Short/Base64/0000755000000000000000000000000007346545000016504 5ustar0000000000000000base64-1.0/src/Data/ByteString/Short/Base64/URL.hs0000644000000000000000000002061707346545000017510 0ustar0000000000000000{-# LANGUAGE DataKinds #-} -- | -- Module : Data.ByteString.Short.Base64.URL -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.ByteString.Short.ShortByteString'-valued combinators for -- implementing the RFC 4648 specification of the Base64url -- encoding format. This includes strictly padded/unpadded and lenient decoding -- variants, as well as internal and external validation for canonicity. -- module Data.ByteString.Short.Base64.URL ( -- * Encoding encodeBase64 , encodeBase64' , encodeBase64Unpadded , encodeBase64Unpadded' -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64Unpadded , decodeBase64UnpaddedUntyped , decodeBase64Padded , decodeBase64PaddedUntyped , decodeBase64Lenient -- * Validation , isBase64Url , isValidBase64Url ) where import Data.Base64.Types import qualified Data.ByteString.Base64.URL as B64U import Data.ByteString.Short (ShortByteString, fromShort, toShort) import Data.Text (Text) import Data.Text.Short (ShortText) import Data.Text.Short.Unsafe (fromShortByteStringUnsafe) -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'ShortByteString' value as a Base64url 'Text' value with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "<>" -- "PDw_Pj4=" -- encodeBase64 :: ShortByteString -> Base64 'UrlPadded ShortText encodeBase64 = fmap fromShortByteStringUnsafe . encodeBase64' {-# INLINE encodeBase64 #-} -- | Encode a 'ShortByteString' as a Base64url 'ShortByteString' value with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64' "<>" -- "PDw_Pj4=" -- encodeBase64' :: ShortByteString -> Base64 'UrlPadded ShortByteString encodeBase64' = fmap toShort . B64U.encodeBase64' . fromShort -- | Decode a Base64url encoded 'ShortByteString' value, either padded or unpadded. -- The correct decoding function is dispatched based on the existence of padding. -- -- For typed values: -- - If a padded value is required, use 'decodeBase64Padded' -- - If an unpadded value is required, use 'decodeBase64Unpadded' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'UrlPadded "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64 $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64 :: UrlAlphabet k => Base64 k ShortByteString -> ShortByteString decodeBase64 = toShort . B64U.decodeBase64 . fmap fromShort -- | Decode an untyped Base64url encoded 'ByteString' value. If its length is not a multiple -- of 4, then padding chars will be added to fill out the input to a multiple of -- 4 for safe decoding as Base64url-encoded values are optionally padded. -- -- For a decoder that fails to decode untyped values of incorrect size: -- - If a padded value is required, use 'decodeBase64PaddedUntyped' -- - If an unpadded value is required, use 'decodeBase64UnpaddedUntyped' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "PDw_Pj4=" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw-Pg=" -- Left "Base64-encoded bytestring has invalid padding" -- -- >>> decodeBase64Untyped "PDw-Pg" -- Right "<<>>" -- decodeBase64Untyped :: ShortByteString -> Either Text ShortByteString decodeBase64Untyped = fmap toShort . B64U.decodeBase64Untyped . fromShort {-# inline decodeBase64Untyped #-} -- | Encode a 'ShortByteString' value as Base64url 'Text' without padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64Unpadded "<>" -- "PDw_Pj4" -- encodeBase64Unpadded :: ShortByteString -> Base64 'UrlUnpadded ShortText encodeBase64Unpadded = fmap fromShortByteStringUnsafe . encodeBase64Unpadded' {-# INLINE encodeBase64Unpadded #-} -- | Encode a 'ShortByteString' value as Base64url without padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64Unpadded' "<>" -- "PDw_Pj4" -- encodeBase64Unpadded' :: ShortByteString -> Base64 'UrlUnpadded ShortByteString encodeBase64Unpadded' = fmap toShort . B64U.encodeBase64Unpadded' . fromShort -- | Decode an unpadded Base64url-encoded 'ShortByteString' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Unpadded $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64Unpadded :: Base64 'UrlUnpadded ShortByteString -> ShortByteString decodeBase64Unpadded = toShort . B64U.decodeBase64Unpadded . fmap fromShort {-# INLINE decodeBase64Unpadded #-} -- | Decode an unpadded, untyped Base64url encoded 'ByteString' value. -- If its length is not a multiple of 4, then padding chars will be added -- to fill out the input to a multiple of 4 for safe decoding as -- Base64url-encoded values are optionally padded. -- -- In general, unless unpadded Base64url is explicitly required, it is -- safer to call 'decodeBase64'. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64UnpaddedUntyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64UnpaddedUntyped "PDw-Pg=" -- Left "Base64-encoded bytestring has invalid padding" -- -- >>> decodeBase64UnpaddedUntyped "PDw-Pg" -- Right "<<>>" -- decodeBase64UnpaddedUntyped :: ShortByteString -> Either Text ShortByteString decodeBase64UnpaddedUntyped = fmap toShort . B64U.decodeBase64UnpaddedUntyped . fromShort {-# inline decodeBase64UnpaddedUntyped #-} -- | Decode a padded Base64url-encoded 'ShortByteString' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Padded $ assertBase64 @'UrlPadded "PDw_Pj4=" -- "<>" -- decodeBase64Padded :: Base64 'UrlPadded ShortByteString -> ShortByteString decodeBase64Padded = toShort . B64U.decodeBase64Padded . fmap fromShort {-# INLINE decodeBase64Padded #-} -- | Decode a padded, untyped Base64url encoded 'ByteString' value. -- -- For a decoder that fails on unpadded input of incorrect size, -- use 'decodeBase64UnpaddedUntyped'. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64PaddedUntyped "PDw_Pj4=" -- Right "<>" -- -- >>> decodeBase64PaddedUntyped "PDw_Pj4" -- Left "Base64-encoded bytestring requires padding" -- decodeBase64PaddedUntyped :: ShortByteString -> Either Text ShortByteString decodeBase64PaddedUntyped = fmap toShort . B64U.decodeBase64PaddedUntyped . fromShort {-# inline decodeBase64PaddedUntyped #-} -- | Leniently decode an unpadded, untyped Base64url-encoded 'ShortByteString'. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64Lenient "PDw_%%%$}Pj4" -- "<>" -- decodeBase64Lenient :: ShortByteString -> ShortByteString decodeBase64Lenient = toShort . B64U.decodeBase64Lenient . fromShort {-# INLINE decodeBase64Lenient #-} -- | Tell whether an untyped 'ShortByteString' is Base64url-encoded. -- -- === __Examples__: -- -- >>> isBase64Url "PDw_Pj4=" -- True -- -- >>> isBase64Url "PDw_Pj4" -- True -- -- >>> isBase64Url "PDw_Pj" -- False -- isBase64Url :: ShortByteString -> Bool isBase64Url = B64U.isBase64Url . fromShort {-# INLINE isBase64Url #-} -- | Tell whether an untyped 'ShortByteString' is a valid Base64url format. -- -- This will not tell you whether or not this is a correct Base64url representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'ShortByteString' value, use 'isBase64Url'. -- -- === __Examples__: -- -- >>> isValidBase64Url "PDw_Pj4=" -- True -- -- >>> isValidBase64Url "PDw_Pj" -- True -- -- >>> isValidBase64Url "%" -- False -- isValidBase64Url :: ShortByteString -> Bool isValidBase64Url = B64U.isValidBase64Url . fromShort {-# INLINE isValidBase64Url #-} base64-1.0/src/Data/Text/Encoding/0000755000000000000000000000000007346545000014741 5ustar0000000000000000base64-1.0/src/Data/Text/Encoding/Base64.hs0000644000000000000000000001056407346545000016327 0ustar0000000000000000{-# LANGUAGE DataKinds #-} -- | -- Module : Data.Text.Encoding.Base64 -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.Text.Text'-valued combinators for -- implementing the RFC 4648 specification of the Base64 -- encoding format. This includes lenient decoding variants, as well as -- internal and external validation for canonicity. -- module Data.Text.Encoding.Base64 ( -- * Encoding encodeBase64 -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64UntypedWith , decodeBase64Lenient -- * Validation , isBase64 , isValidBase64 ) where import Data.Base64.Types import Data.Bifunctor (first) import Data.ByteString (ByteString) import qualified Data.ByteString.Base64 as B64 import Data.Text (Text) import qualified Data.Text.Encoding as T import Data.Text.Encoding.Base64.Error -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'Text' value in Base64 with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "Sun" -- "U3Vu" -- encodeBase64 :: Text -> Base64 'StdPadded Text encodeBase64 = B64.encodeBase64 . T.encodeUtf8 {-# INLINE encodeBase64 #-} -- | Decode a padded Base64-encoded 'Text' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'StdPadded "U3Vu" -- "Sun" -- decodeBase64 :: StdAlphabet k => Base64 k Text -> Text decodeBase64 = T.decodeUtf8 . B64.decodeBase64 . fmap T.encodeUtf8 {-# INLINE decodeBase64 #-} -- | Decode a padded, untyped Base64-encoded 'Text' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "U3Vu" -- Right "Sun" -- decodeBase64Untyped :: Text -> Either Text Text decodeBase64Untyped = fmap T.decodeUtf8 . B64.decodeBase64Untyped . T.encodeUtf8 {-# INLINE decodeBase64Untyped #-} -- | Attempt to decode an untyped 'Text' value as Base64, converting from -- 'ByteString' to 'Text' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Example__: -- -- @ -- 'decodeBase64UntypedWith' 'T.decodeUtf8'' -- :: 'ByteString' -> 'Either' ('Base64Error' 'UnicodeException') 'Text' -- @ -- decodeBase64UntypedWith :: (ByteString -> Either err Text) -- ^ convert a bytestring to text (e.g. 'T.decodeUtf8'') -> ByteString -- ^ Input text to decode -> Either (Base64Error err) Text decodeBase64UntypedWith f t = case B64.decodeBase64Untyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64UntypedWith #-} -- | Leniently decode an untyped Base64-encoded 'Text' value. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "U3Vu" -- "Sun" -- -- >>> decodeBase64Lenient "U3V" -- "Su" -- -- >>> decodeBase64Lenient "U3V=" -- "Su" -- decodeBase64Lenient :: Text -> Text decodeBase64Lenient = T.decodeUtf8 . B64.decodeBase64Lenient . T.encodeUtf8 {-# INLINE decodeBase64Lenient #-} -- | Tell whether an untyped 'Text' value is Base64-encoded. -- -- === __Examples__: -- -- >>> isBase64 "U3Vu" -- True -- -- >>> isBase64 "U3V" -- False -- -- >>> isBase64 "U3V=" -- False -- isBase64 :: Text -> Bool isBase64 = B64.isBase64 . T.encodeUtf8 {-# INLINE isBase64 #-} -- | Tell whether an untyped 'Text' value is a valid Base64 format. -- -- This will not tell you whether or not this is a correct Base64 representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'Text' value, use 'isBase64'. -- -- === __Examples__: -- -- >>> isValidBase64 "U3Vu" -- True -- -- >>> isValidBase64 "U3V" -- True -- -- >>> isValidBase64 "U3V=" -- True -- -- >>> isValidBase64 "%" -- False -- isValidBase64 :: Text -> Bool isValidBase64 = B64.isValidBase64 . T.encodeUtf8 {-# INLINE isValidBase64 #-} base64-1.0/src/Data/Text/Encoding/Base64/0000755000000000000000000000000007346545000015765 5ustar0000000000000000base64-1.0/src/Data/Text/Encoding/Base64/Error.hs0000644000000000000000000000243107346545000017412 0ustar0000000000000000{-# LANGUAGE DeriveGeneric #-} -- | -- Module : Data.Text.Encoding.Base64.Error -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains the error types raised (not as exceptions!) -- in the decoding process. -- module Data.Text.Encoding.Base64.Error ( Base64Error(..) ) where import Control.DeepSeq (NFData(..)) import Control.Exception (Exception(..)) import Data.Text (Text) import GHC.Generics -- | This data type represents the type of decoding errors of -- various kinds as they pertain to decoding 'Text' values. -- Namely, to distinguish between decoding errors from opaque -- unicode exceptions caught in the unicode decoding process. -- data Base64Error e = DecodeError !Text -- ^ The error associated with decoding failure -- as a result of the Base64 decoding process | ConversionError !e -- ^ The error associated with the decoding failure -- as a result of the conversion process deriving ( Eq, Show , Generic -- ^ @since 4.2.2 ) -- | -- -- @since 4.2.2 -- instance Exception e => Exception (Base64Error e) -- | -- -- @since 4.2.2 -- instance NFData e => NFData (Base64Error e) base64-1.0/src/Data/Text/Encoding/Base64/URL.hs0000644000000000000000000002352507346545000016772 0ustar0000000000000000{-# LANGUAGE DataKinds #-} -- | -- Module : Data.Text.Encoding.Base64.URL -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.Text.Text'-valued combinators for -- implementing the RFC 4648 specification of the Base64url -- encoding format. This includes strictly padded/unpadded and lenient decoding -- variants, as well as internal and external validation for canonicity. -- module Data.Text.Encoding.Base64.URL ( -- * Encoding encodeBase64 , encodeBase64Unpadded -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64UntypedWith , decodeBase64Unpadded , decodeBase64UnpaddedUntyped , decodeBase64UnpaddedUntypedWith , decodeBase64Padded , decodeBase64PaddedUntyped , decodeBase64PaddedUntypedWith , decodeBase64Lenient -- * Validation , isBase64Url , isValidBase64Url ) where import Data.Base64.Types import Data.Bifunctor (first) import Data.ByteString (ByteString) import qualified Data.ByteString.Base64.URL as B64U import Data.Text (Text) import qualified Data.Text.Encoding as T import Data.Text.Encoding.Base64.Error -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'Text' value in Base64url with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "<>" -- "PDw_Pj4=" -- encodeBase64 :: Text -> Base64 'UrlPadded Text encodeBase64 = B64U.encodeBase64 . T.encodeUtf8 {-# INLINE encodeBase64 #-} -- | Decode a Base64url encoded 'Text' value, either padded or unpadded. -- The correct decoding function is dispatched based on the existence of padding. -- -- For typed values: -- - If a padded value is required, use 'decodeBase64Padded' -- - If an unpadded value is required, use 'decodeBase64Unpadded' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'UrlPadded "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64 $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64 :: UrlAlphabet k => Base64 k Text -> Text decodeBase64 = T.decodeUtf8 . B64U.decodeBase64 . fmap T.encodeUtf8 {-# INLINE decodeBase64 #-} -- | Decode an untyped Base64url encoded 'Text' value. If its length is not a multiple -- of 4, then padding chars will be added to fill out the input to a multiple of -- 4 for safe decoding as Base64url-encoded values are optionally padded. -- -- For a decoder that fails to decode untyped values of incorrect size: -- - If a padded value is required, use 'decodeBase64PaddedUntyped' -- - If an unpadded value is required, use 'decodeBase64UnpaddedUntyped' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "PDw_Pj4=" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw-Pg=" -- Left "Base64-encoded bytestring has invalid padding" -- -- >>> decodeBase64Untyped "PDw-Pg" -- Right "<<>>" -- decodeBase64Untyped :: Text -> Either Text Text decodeBase64Untyped = fmap T.decodeUtf8 . B64U.decodeBase64Untyped . T.encodeUtf8 {-# inline decodeBase64Untyped #-} -- | Attempt to decode an untyped 'ByteString' value as Base64url, converting from -- 'ByteString' to 'Text' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Examples__: -- -- @ -- 'decodeBase64UntypedWith' 'T.decodeUtf8'' -- :: 'Text' -> 'Either' ('Base64Error' 'UnicodeException') 'Text' -- @ -- decodeBase64UntypedWith :: (ByteString -> Either err Text) -- ^ convert a bytestring to text (e.g. 'T.decodeUtf8'') -> ByteString -- ^ Input text to decode -> Either (Base64Error err) Text decodeBase64UntypedWith f t = case B64U.decodeBase64Untyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64UntypedWith #-} -- | Encode a 'Text' value in Base64url without padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64Unpadded "<>" -- "PDw_Pj4" -- encodeBase64Unpadded :: Text -> Base64 'UrlUnpadded Text encodeBase64Unpadded = B64U.encodeBase64Unpadded . T.encodeUtf8 {-# INLINE encodeBase64Unpadded #-} -- | Decode an unpadded Base64url encoded 'Text' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Unpadded $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64Unpadded :: Base64 'UrlUnpadded Text -> Text decodeBase64Unpadded = T.decodeUtf8 . B64U.decodeBase64Unpadded . fmap T.encodeUtf8 {-# INLINE decodeBase64Unpadded #-} -- | Decode a unpadded, untyped Base64url-encoded 'Text' value. If its length is not a multiple -- of 4, then padding chars will be added to fill out the input to a multiple of -- 4 for safe decoding as base64url encodings are optionally padded. -- -- For a decoder that fails on unpadded input, use 'decodeBase64PaddedUntyped' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64UnpaddedUntyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64UnpaddedUntyped "PDw-Pg=" -- Left "Base64-encoded bytestring has invalid padding" -- -- >>> decodeBase64UnpaddedUntyped "PDw-Pg" -- Right "<<>>" -- decodeBase64UnpaddedUntyped :: Text -> Either Text Text decodeBase64UnpaddedUntyped = fmap T.decodeUtf8 . B64U.decodeBase64UnpaddedUntyped . T.encodeUtf8 {-# inline decodeBase64UnpaddedUntyped #-} -- | Attempt to decode an untyped, unpadded 'ByteString' value as Base64url, converting from -- 'ByteString' to 'Text' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Example__: -- -- @ -- 'decodeBase64UnpaddedUntypedWith' 'T.decodeUtf8'' -- :: 'ByteString' -> 'Either' ('Base64Error' 'UnicodeException') 'Text' -- @ -- decodeBase64UnpaddedUntypedWith :: (ByteString -> Either err Text) -- ^ convert a bytestring to text (e.g. 'T.decodeUtf8'') -> ByteString -- ^ Input text to decode -> Either (Base64Error err) Text decodeBase64UnpaddedUntypedWith f t = case B64U.decodeBase64UnpaddedUntyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64UnpaddedUntypedWith #-} -- | Decode a padded Base64url encoded 'Text' value -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Padded $ assertBase64 @'UrlPadded "PDw_Pj4=" -- "<>" -- decodeBase64Padded :: Base64 'UrlPadded Text -> Text decodeBase64Padded = T.decodeUtf8 . B64U.decodeBase64Padded . fmap T.encodeUtf8 {-# INLINE decodeBase64Padded #-} -- | Decode an untyped, padded Base64url encoded 'Text' value -- -- For a decoder that fails on padded input, use 'decodeBase64UnpaddedUntyped' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64PaddedUntyped "PDw_Pj4=" -- Right "<>" -- decodeBase64PaddedUntyped :: Text -> Either Text Text decodeBase64PaddedUntyped = fmap T.decodeUtf8 . B64U.decodeBase64PaddedUntyped . T.encodeUtf8 {-# inline decodeBase64PaddedUntyped #-} -- | Attempt to decode a padded, untyped 'ByteString' value as Base64url, converting from -- 'ByteString' to 'Text' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Example__: -- -- @ -- 'decodeBase64PaddedWith' 'T.decodeUtf8'' -- :: 'ByteString' -> 'Either' ('Base64Error' 'UnicodeException') 'Text' -- @ -- decodeBase64PaddedUntypedWith :: (ByteString -> Either err Text) -- ^ convert a bytestring to text (e.g. 'T.decodeUtf8'') -> ByteString -- ^ Input text to decode -> Either (Base64Error err) Text decodeBase64PaddedUntypedWith f t = case B64U.decodeBase64PaddedUntyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64PaddedUntypedWith #-} -- | Leniently decode an untyped Base64url-encoded 'Text'. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64Lenient "PDw_%%%$}Pj4" -- "<>" -- decodeBase64Lenient :: Text -> Text decodeBase64Lenient = T.decodeUtf8 . B64U.decodeBase64Lenient . T.encodeUtf8 {-# INLINE decodeBase64Lenient #-} -- | Tell whether an untyped 'Text' value is Base64url-encoded. -- -- === __Examples__: -- -- >>> isBase64Url "PDw_Pj4=" -- True -- -- >>> isBase64Url "PDw_Pj4" -- True -- -- >>> isBase64Url "PDw_Pj" -- False -- isBase64Url :: Text -> Bool isBase64Url = B64U.isBase64Url . T.encodeUtf8 {-# INLINE isBase64Url #-} -- | Tell whether an untyped 'Text' value is a valid Base64url format. -- -- This will not tell you whether or not this is a correct Base64url representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'Text' value, use 'isBase64Url'. -- -- === __Examples__: -- -- >>> isValidBase64Url "PDw_Pj4=" -- True -- -- >>> isValidBase64Url "PDw_Pj" -- True -- -- >>> isValidBase64Url "%" -- False -- isValidBase64Url :: Text -> Bool isValidBase64Url = B64U.isValidBase64Url . T.encodeUtf8 {-# INLINE isValidBase64Url #-} base64-1.0/src/Data/Text/Lazy/Encoding/0000755000000000000000000000000007346545000015660 5ustar0000000000000000base64-1.0/src/Data/Text/Lazy/Encoding/Base64.hs0000644000000000000000000001126507346545000017245 0ustar0000000000000000{-# LANGUAGE DataKinds #-} -- | -- Module : Data.Text.Lazy.Encoding.Base64 -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.Text.Lazy.Text'-valued combinators -- implementing the RFC 4648 specification for the Base64 -- encoding format. This includes lenient decoding variants, and -- external + internal validations for canonicity. -- module Data.Text.Lazy.Encoding.Base64 ( -- * Encoding encodeBase64 -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64UntypedWith , decodeBase64Lenient -- * Validation , isBase64 , isValidBase64 ) where import Data.Base64.Types import Data.Bifunctor (first) import Data.ByteString.Lazy (ByteString) import qualified Data.ByteString.Lazy.Base64 as BL64 import qualified Data.Text as T import Data.Text.Encoding.Base64.Error import qualified Data.Text.Lazy as TL import qualified Data.Text.Lazy.Encoding as TL -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'TL.Text' value in Base64 with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "Sun" -- "U3Vu" -- encodeBase64 :: TL.Text -> Base64 'StdPadded TL.Text encodeBase64 = BL64.encodeBase64 . TL.encodeUtf8 {-# INLINE encodeBase64 #-} -- | Decode a padded Base64-encoded 'TL.Text' value -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'StdPadded "U3Vu" -- "Sun" -- decodeBase64 :: StdAlphabet k => Base64 k TL.Text -> TL.Text decodeBase64 = TL.decodeUtf8 . BL64.decodeBase64 . fmap TL.encodeUtf8 {-# INLINE decodeBase64 #-} -- | Decode a padded, untyped Base64-encoded 'TL.Text' value -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "U3Vu" -- Right "Sun" -- -- >>> decodeBase64Untyped "U3V" -- Left "Base64-encoded bytestring requires padding" -- -- >>> decodeBase64Untyped "U3V=" -- Left "non-canonical encoding detected at offset: 2" -- decodeBase64Untyped :: TL.Text -> Either T.Text TL.Text decodeBase64Untyped = fmap TL.decodeUtf8 . BL64.decodeBase64Untyped . TL.encodeUtf8 {-# INLINE decodeBase64Untyped #-} -- | Attempt to decode a 'ByteString' value as Base64, converting from -- 'ByteString' to 'TL.Text' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Example__: -- -- @ -- 'decodeBase64UntypedWith' 'TL.decodeUtf8'' -- :: 'ByteString' -> 'Either' ('Base64Error' 'UnicodeException') 'TL.Text' -- @ -- decodeBase64UntypedWith :: (ByteString -> Either err TL.Text) -- ^ convert a bytestring to text (e.g. 'TL.decodeUtf8'') -> ByteString -- ^ Input text to decode -> Either (Base64Error err) TL.Text decodeBase64UntypedWith f t = case BL64.decodeBase64Untyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64UntypedWith #-} -- | Leniently decode an untyped Base64-encoded 'TL.Text' value. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "U3Vu" -- "Sun" -- -- >>> decodeBase64Lenient "U3V" -- "Su" -- -- >>> decodeBase64Lenient "U3V=" -- "Su" -- decodeBase64Lenient :: TL.Text -> TL.Text decodeBase64Lenient = TL.decodeUtf8 . BL64.decodeBase64Lenient . TL.encodeUtf8 {-# INLINE decodeBase64Lenient #-} -- | Tell whether an untyped 'TL.Text' value is Base64-encoded. -- -- === __Examples__: -- -- >>> isBase64 "U3Vu" -- True -- -- >>> isBase64 "U3V" -- False -- -- >>> isBase64 "U3V=" -- False -- isBase64 :: TL.Text -> Bool isBase64 = BL64.isBase64 . TL.encodeUtf8 {-# INLINE isBase64 #-} -- | Tell whether an untyped 'TL.Text' value is a valid Base64 format. -- -- This will not tell you whether or not this is a correct Base64 representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'TL.Text' value, use 'isBase64'. -- -- === __Examples__: -- -- >>> isValidBase64 "U3Vu" -- True -- -- >>> isValidBase64 "U3V" -- True -- -- >>> isValidBase64 "U3V=" -- True -- -- >>> isValidBase64 "%" -- False -- isValidBase64 :: TL.Text -> Bool isValidBase64 = BL64.isValidBase64 . TL.encodeUtf8 {-# INLINE isValidBase64 #-} base64-1.0/src/Data/Text/Lazy/Encoding/Base64/0000755000000000000000000000000007346545000016704 5ustar0000000000000000base64-1.0/src/Data/Text/Lazy/Encoding/Base64/URL.hs0000644000000000000000000002316507346545000017711 0ustar0000000000000000{-# LANGUAGE DataKinds #-} -- | -- Module : Data.Text.Lazy.Encoding.Base64.URL -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.Text.Lazy.Text'-valued combinators for -- implementing the RFC 4648 specification of the Base64url -- encoding format. This includes strictly padded/unpadded and lenient decoding -- variants, as well as internal and external validation for canonicity. -- module Data.Text.Lazy.Encoding.Base64.URL ( -- * Encoding encodeBase64 , encodeBase64Unpadded -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64UntypedWith , decodeBase64Unpadded , decodeBase64UnpaddedUntyped , decodeBase64UnpaddedUntypedWith , decodeBase64Padded , decodeBase64PaddedUntyped , decodeBase64PaddedUntypedWith , decodeBase64Lenient -- * Validation , isBase64Url , isValidBase64Url ) where import Data.Base64.Types import Data.Bifunctor (first) import Data.ByteString.Lazy (ByteString) import qualified Data.ByteString.Lazy.Base64.URL as BL64U import qualified Data.Text as T import Data.Text.Encoding.Base64.Error import qualified Data.Text.Lazy as TL import qualified Data.Text.Lazy.Encoding as TL -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'TL.Text' value in Base64url with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "<>" -- "PDw_Pj4=" -- encodeBase64 :: TL.Text -> Base64 'UrlPadded TL.Text encodeBase64 = BL64U.encodeBase64 . TL.encodeUtf8 {-# INLINE encodeBase64 #-} -- | Decode an arbitrarily Base64url-encoded 'TL.Text' value. -- -- For typed values: -- - If a padded value is required, use 'decodeBase64Padded' -- - If an unpadded value is required, use 'decodeBase64Unpadded' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'UrlPadded "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64 $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64 :: UrlAlphabet k => Base64 k TL.Text -> TL.Text decodeBase64 = TL.decodeUtf8 . BL64U.decodeBase64 . fmap TL.encodeUtf8 {-# INLINE decodeBase64 #-} -- | Decode an untyped Base64url-encoded 'TL.Text' value. If its length is not a multiple -- of 4, then padding chars will be added to fill out the input to a multiple of -- 4 for safe decoding as base64url encodings are optionally padded. -- -- For a decoder that fails on unpadded input, use 'decodeBase64Unpadded'. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "PDw_Pj4=" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw-Pg=" -- Left "Base64-encoded bytestring has invalid padding" -- -- >>> decodeBase64Untyped "PDw-Pg" -- Right "<<>>" -- decodeBase64Untyped :: TL.Text -> Either T.Text TL.Text decodeBase64Untyped = fmap TL.decodeUtf8 . BL64U.decodeBase64Untyped . TL.encodeUtf8 {-# INLINE decodeBase64Untyped #-} -- | Attempt to decode an untyped lazy 'ByteString' value as Base64url, converting from -- 'ByteString' to 'TL.Text' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Examples__: -- -- @ -- 'decodeBase64With' 'TL.decodeUtf8'' -- :: 'ByteString' -> 'Either' ('Base64Error' 'UnicodeException') 'TL.Text' -- @ -- decodeBase64UntypedWith :: (ByteString -> Either err TL.Text) -- ^ convert a bytestring to text (e.g. 'TL.decodeUtf8'') -> ByteString -- ^ Input text to decode -> Either (Base64Error err) TL.Text decodeBase64UntypedWith f t = case BL64U.decodeBase64Untyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64UntypedWith #-} -- | Encode a 'TL.Text' value in Base64url without padding. Note that for Base64url, -- padding is optional. If you call this function, you will simply be encoding -- as Base64url and stripping padding chars from the output. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64Unpadded "<>" -- "PDw_Pj4" -- encodeBase64Unpadded :: TL.Text -> Base64 'UrlUnpadded TL.Text encodeBase64Unpadded = BL64U.encodeBase64Unpadded . TL.encodeUtf8 {-# INLINE encodeBase64Unpadded #-} -- | Decode an unpadded Base64url encoded 'Text' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Unpadded $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64Unpadded :: Base64 'UrlUnpadded TL.Text -> TL.Text decodeBase64Unpadded = TL.decodeUtf8 . BL64U.decodeBase64Unpadded . fmap TL.encodeUtf8 {-# INLINE decodeBase64Unpadded #-} -- | Decode an unpadded, untyped Base64url encoded 'TL.Text' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64UnpaddedUntyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64UnpaddedUntyped "PDw_Pj4=" -- Left "Base64-encoded bytestring has invalid padding" -- decodeBase64UnpaddedUntyped :: TL.Text -> Either T.Text TL.Text decodeBase64UnpaddedUntyped = fmap TL.decodeUtf8 . BL64U.decodeBase64UnpaddedUntyped . TL.encodeUtf8 {-# INLINE decodeBase64UnpaddedUntyped #-} -- | Attempt to decode an unpadded, untyped lazy 'ByteString' value as Base64url, converting from -- 'ByteString' to 'TL.Text' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Examples__: -- -- @ -- 'decodeBase64UnpaddedUntypedWith' 'TL.decodeUtf8'' -- :: 'ByteString' -> 'Either' ('Base64Error' 'UnicodeException') 'TL.Text' -- @ -- decodeBase64UnpaddedUntypedWith :: (ByteString -> Either err TL.Text) -- ^ convert a bytestring to text (e.g. 'TL.decodeUtf8'') -> ByteString -- ^ Input text to decode -> Either (Base64Error err) TL.Text decodeBase64UnpaddedUntypedWith f t = case BL64U.decodeBase64UnpaddedUntyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64UnpaddedUntypedWith #-} -- | Decode a padded Base64url encoded 'TL.Text' value -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Padded $ assertBase64 @'UrlPadded "PDw_Pj4=" -- "<>" -- decodeBase64Padded :: Base64 'UrlPadded TL.Text -> TL.Text decodeBase64Padded = TL.decodeUtf8 . BL64U.decodeBase64Padded . fmap TL.encodeUtf8 {-# INLINE decodeBase64Padded #-} -- | Decode an untyped, padded Base64url encoded 'Text' value -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64PaddedUntyped "PDw_Pj4=" -- Right "<>" -- decodeBase64PaddedUntyped :: TL.Text -> Either T.Text TL.Text decodeBase64PaddedUntyped = fmap TL.decodeUtf8 . BL64U.decodeBase64PaddedUntyped . TL.encodeUtf8 {-# inline decodeBase64PaddedUntyped #-} -- | Attempt to decode a padded, untyped lazy 'ByteString' value as Base64url, converting from -- 'ByteString' to 'TL.Text' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Example__: -- -- @ -- 'decodeBase64PaddedWith' 'T.decodeUtf8'' -- :: 'ByteString' -> 'Either' ('Base64Error' 'UnicodeException') 'Text' -- @ -- decodeBase64PaddedUntypedWith :: (ByteString -> Either err TL.Text) -- ^ convert a bytestring to text (e.g. 'TL.decodeUtf8'') -> ByteString -- ^ Input text to decode -> Either (Base64Error err) TL.Text decodeBase64PaddedUntypedWith f t = case BL64U.decodeBase64PaddedUntyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64PaddedUntypedWith #-} -- | Leniently decode an untyped Base64url-encoded 'TL.Text'. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64Lenient "PDw_%%%$}Pj4" -- "<>" -- decodeBase64Lenient :: TL.Text -> TL.Text decodeBase64Lenient = TL.decodeUtf8 . BL64U.decodeBase64Lenient . TL.encodeUtf8 {-# INLINE decodeBase64Lenient #-} -- | Tell whether an untyped 'TL.Text' value is Base64url-encoded -- -- === __Examples__: -- -- >>> isBase64Url "PDw_Pj4=" -- True -- -- >>> isBase64Url "PDw_Pj4" -- True -- -- >>> isBase64Url "PDw_Pj" -- False -- isBase64Url :: TL.Text -> Bool isBase64Url = BL64U.isBase64Url . TL.encodeUtf8 {-# INLINE isBase64Url #-} -- | Tell whether an untyped 'TL.Text' value is a valid Base64url format. -- -- This will not tell you whether or not this is a correct Base64url representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'TL.Text' value, use 'isBase64Url'. -- -- === __Examples__: -- -- >>> isValidBase64Url "PDw_Pj4=" -- True -- -- >>> isValidBase64Url "PDw_Pj" -- True -- -- >>> isValidBase64Url "%" -- False -- isValidBase64Url :: TL.Text -> Bool isValidBase64Url = BL64U.isValidBase64Url . TL.encodeUtf8 {-# INLINE isValidBase64Url #-} base64-1.0/src/Data/Text/Short/Encoding/0000755000000000000000000000000007346545000016040 5ustar0000000000000000base64-1.0/src/Data/Text/Short/Encoding/Base64.hs0000644000000000000000000001145007346545000017421 0ustar0000000000000000{-# LANGUAGE DataKinds #-} -- | -- Module : Data.Text.Short.Encoding.Base64 -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.Text.Short.ShortText'-valued combinators -- implementing the RFC 4648 specification for the Base64 -- encoding format. This includes lenient decoding variants, and -- external + internal validations for canonicity. -- module Data.Text.Short.Encoding.Base64 ( -- * Encoding encodeBase64 -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64UntypedWith , decodeBase64Lenient -- * Validation , isBase64 , isValidBase64 ) where import Data.Base64.Types import Data.Bifunctor (first) import qualified Data.ByteString.Base64 as B64 import Data.ByteString.Short (ShortByteString) import qualified Data.ByteString.Short.Base64 as BS64 import Data.Text (Text) import qualified Data.Text.Encoding.Base64 as B64T import Data.Text.Encoding.Base64.Error import Data.Text.Short import Data.Text.Short.Unsafe -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'ShortText' value in Base64 with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "Sun" -- "U3Vu" -- encodeBase64 :: ShortText -> Base64 'StdPadded ShortText encodeBase64 = fmap fromByteStringUnsafe . B64.encodeBase64' . toByteString {-# INLINE encodeBase64 #-} -- | Decode a padded Base64-encoded 'ShortText' value -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'StdPadded "U3Vu" -- "Sun" -- decodeBase64 :: StdAlphabet k => Base64 k ShortText -> ShortText decodeBase64 = fromText . B64T.decodeBase64 . fmap toText {-# INLINE decodeBase64 #-} -- | Decode a padded Base64-encoded 'ShortText' value -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "U3Vu" -- Right "Sun" -- -- >>> decodeBase64Untyped "U3V" -- Left "Base64-encoded bytestring requires padding" -- -- >>> decodeBase64Untyped "U3V=" -- Left "non-canonical encoding detected at offset: 2" -- decodeBase64Untyped :: ShortText -> Either Text ShortText decodeBase64Untyped = fmap fromText . B64T.decodeBase64Untyped . toText {-# INLINE decodeBase64Untyped #-} -- | Attempt to decode an untyped 'ShortByteString' value as Base64, converting from -- 'ByteString' to 'ShortText' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Example__: -- -- @ -- 'decodeBase64UntypedWith' 'T.decodeUtf8'' -- :: 'ShortByteString' -> 'Either' ('Base64Error' 'UnicodeException') 'ShortText' -- @ -- decodeBase64UntypedWith :: (ShortByteString -> Either err ShortText) -- ^ convert a bytestring to text (e.g. 'T.decodeUtf8'') -> ShortByteString -- ^ Input text to decode -> Either (Base64Error err) ShortText decodeBase64UntypedWith f t = case BS64.decodeBase64Untyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64UntypedWith #-} -- | Leniently decode an untyped Base64-encoded 'ShortText' value. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "U3Vu" -- "Sun" -- -- >>> decodeBase64Lenient "U3V" -- "Su" -- -- >>> decodeBase64Lenient "U3V=" -- "Su" -- decodeBase64Lenient :: ShortText -> ShortText decodeBase64Lenient = fromText . B64T.decodeBase64Lenient . toText {-# INLINE decodeBase64Lenient #-} -- | Tell whether an untyped 'ShortText' value is Base64-encoded. -- -- === __Examples__: -- -- >>> isBase64 "U3Vu" -- True -- -- >>> isBase64 "U3V" -- False -- -- >>> isBase64 "U3V=" -- False -- isBase64 :: ShortText -> Bool isBase64 = B64.isBase64 . toByteString {-# INLINE isBase64 #-} -- | Tell whether an untyped 'ShortText' value is a valid Base64 format. -- -- This will not tell you whether or not this is a correct Base64 representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'ShortText' value, use 'isBase64'. -- -- === __Examples__: -- -- >>> isValidBase64 "U3Vu" -- True -- -- >>> isValidBase64 "U3V" -- True -- -- >>> isValidBase64 "U3V=" -- True -- -- >>> isValidBase64 "%" -- False -- isValidBase64 :: ShortText -> Bool isValidBase64 = B64.isValidBase64 . toByteString {-# INLINE isValidBase64 #-} base64-1.0/src/Data/Text/Short/Encoding/Base64/0000755000000000000000000000000007346545000017064 5ustar0000000000000000base64-1.0/src/Data/Text/Short/Encoding/Base64/URL.hs0000644000000000000000000002407107346545000020066 0ustar0000000000000000{-# LANGUAGE DataKinds #-} -- | -- Module : Data.Text.Short.Encoding.Base64.URL -- Copyright : (c) 2019-2023 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : stable -- Portability : non-portable -- -- This module contains 'Data.Text.Short.ShortText'-valued combinators -- implementing the RFC 4648 specification for the Base64url -- encoding format. This includes strictly padded/unpadded and lenient -- decoding variants, and external + internal validations for canonicity. -- module Data.Text.Short.Encoding.Base64.URL ( -- * Encoding encodeBase64 , encodeBase64Unpadded -- * Decoding , decodeBase64 , decodeBase64Untyped , decodeBase64UntypedWith , decodeBase64Unpadded , decodeBase64UnpaddedUntyped , decodeBase64UnpaddedUntypedWith , decodeBase64Padded , decodeBase64PaddedUntyped , decodeBase64PaddedUntypedWith , decodeBase64Lenient -- * Validation , isBase64Url , isValidBase64Url ) where import Data.Base64.Types import Data.Bifunctor (first) import qualified Data.ByteString.Base64.URL as B64U import Data.ByteString.Short (ShortByteString) import qualified Data.ByteString.Short.Base64.URL as BS64U import Data.Text (Text) import qualified Data.Text.Encoding.Base64.URL as B64TU import Data.Text.Encoding.Base64.Error import Data.Text.Short import Data.Text.Short.Unsafe -- $setup -- -- >>> import Data.Base64.Types -- >>> :set -XOverloadedStrings -- >>> :set -XTypeApplications -- >>> :set -XDataKinds -- -- | Encode a 'ShortText' value in Base64url with padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64 "<>" -- "PDw_Pj4=" -- encodeBase64 :: ShortText -> Base64 'UrlPadded ShortText encodeBase64 = fmap fromByteStringUnsafe . B64U.encodeBase64' . toByteString {-# INLINE encodeBase64 #-} -- | Decode an arbitrarily padded Base64url-encoded 'ShortText' value. -- -- For typed values: -- - If a padded value is required, use 'decodeBase64Padded' -- - If an unpadded value is required, use 'decodeBase64Unpadded' -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64 $ assertBase64 @'UrlPadded "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64 $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64 :: UrlAlphabet k => Base64 k ShortText -> ShortText decodeBase64 = fromText . B64TU.decodeBase64 . fmap toText {-# INLINE decodeBase64 #-} -- | Decode an untyped padded Base64url-encoded 'ShortText' value. If its length is not a multiple -- of 4, then padding chars will be added to fill out the input to a multiple of -- 4 for safe decoding as base64url encodings are optionally padded. -- -- For a decoder that fails on unpadded input, use 'decodeBase64Unpadded'. -- -- /Note:/ This function makes sure that decoding is total by deferring to -- 'T.decodeUtf8'. This will always round trip for any valid Base64-encoded -- text value, but it may not round trip for bad inputs. The onus is on the -- caller to make sure inputs are valid. If unsure, defer to `decodeBase64With` -- and pass in a custom decode function. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Untyped "PDw_Pj4=" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64Untyped "PDw-Pg=" -- Left "Base64-encoded bytestring has invalid padding" -- -- >>> decodeBase64Untyped "PDw-Pg" -- Right "<<>>" -- decodeBase64Untyped :: ShortText -> Either Text ShortText decodeBase64Untyped = fmap fromText . B64TU.decodeBase64Untyped . toText {-# INLINE decodeBase64Untyped #-} -- | Attempt to decode an untyped 'ShortByteString' value as Base64url, converting from -- 'ByteString' to 'ShortText' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Examples__: -- -- @ -- 'decodeBase64With' 'T.decodeUtf8'' -- :: 'ShortByteString' -> 'Either' ('Base64Error' 'UnicodeException') 'ShortText' -- @ -- decodeBase64UntypedWith :: (ShortByteString -> Either err ShortText) -- ^ convert a bytestring to text (e.g. 'T.decodeUtf8'') -> ShortByteString -- ^ Input text to decode -> Either (Base64Error err) ShortText decodeBase64UntypedWith f t = case BS64U.decodeBase64Untyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64UntypedWith #-} -- | Encode a 'ShortText' value in Base64url without padding. -- -- See: -- -- === __Examples__: -- -- >>> encodeBase64Unpadded "<>" -- "PDw_Pj4" -- encodeBase64Unpadded :: ShortText -> Base64 'UrlUnpadded ShortText encodeBase64Unpadded = fmap fromByteStringUnsafe . B64U.encodeBase64Unpadded' . toByteString {-# INLINE encodeBase64Unpadded #-} -- | Decode an unpadded Base64url encoded 'ShortText' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Unpadded $ assertBase64 @'UrlUnpadded "PDw_Pj4" -- "<>" -- decodeBase64Unpadded :: Base64 'UrlUnpadded ShortText -> ShortText decodeBase64Unpadded = fromText . B64TU.decodeBase64Unpadded . fmap toText {-# INLINE decodeBase64Unpadded #-} -- | Decode an untyped, unpadded Base64url encoded 'ShortText' value. -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64UnpaddedUntyped "PDw_Pj4" -- Right "<>" -- -- >>> decodeBase64UnpaddedUntyped "PDw_Pj4=" -- Left "Base64-encoded bytestring has invalid padding" -- decodeBase64UnpaddedUntyped :: ShortText -> Either Text ShortText decodeBase64UnpaddedUntyped = fmap fromText . B64TU.decodeBase64UnpaddedUntyped . toText {-# INLINE decodeBase64UnpaddedUntyped #-} -- | Attempt to decode an untyped, unpadded 'ShortByteString' value as Base64url, converting from -- 'ShortByteString' to 'ShortText' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Examples__: -- -- @ -- 'decodeBase64UnpaddedWith' 'T.decodeUtf8'' -- :: 'ShortByteString' -> 'Either' ('Base64Error' 'UnicodeException') 'ShortText' -- @ -- decodeBase64UnpaddedUntypedWith :: (ShortByteString -> Either err ShortText) -- ^ convert a bytestring to text (e.g. 'T.decodeUtf8'') -> ShortByteString -- ^ Input text to decode -> Either (Base64Error err) ShortText decodeBase64UnpaddedUntypedWith f t = case BS64U.decodeBase64UnpaddedUntyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64UnpaddedUntypedWith #-} -- | Decode a padded Base64url encoded 'ShortText' value -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64Padded $ assertBase64 @'UrlPadded "PDw_Pj4=" -- "<>" -- decodeBase64Padded :: Base64 'UrlPadded ShortText -> ShortText decodeBase64Padded = fromText . B64TU.decodeBase64Padded . fmap toText {-# INLINE decodeBase64Padded #-} -- | Decode an untyped, padded Base64url encoded 'ShortText' value -- -- See: -- -- === __Examples__: -- -- >>> decodeBase64PaddedUntyped "PDw_Pj4=" -- Right "<>" -- -- >>> decodeBase64PaddedUntyped "PDw_Pj4" -- Left "Base64-encoded bytestring requires padding" -- decodeBase64PaddedUntyped :: ShortText -> Either Text ShortText decodeBase64PaddedUntyped = fmap fromText . B64TU.decodeBase64PaddedUntyped . toText {-# INLINE decodeBase64PaddedUntyped #-} -- | Attempt to decode an untyped, padded 'ShortByteString' value as Base64url, converting from -- 'ByteString' to 'ShortText' according to some encoding function. In practice, -- This is something like 'decodeUtf8'', which may produce an error. -- -- See: -- -- === __Examples__: -- -- @ -- 'decodeBase64With' 'T.decodeUtf8'' -- :: 'ShortByteString' -> 'Either' ('Base64Error' 'UnicodeException') 'ShortText' -- @ -- decodeBase64PaddedUntypedWith :: (ShortByteString -> Either err ShortText) -- ^ convert a bytestring to text (e.g. 'T.decodeUtf8'') -> ShortByteString -- ^ Input text to decode -> Either (Base64Error err) ShortText decodeBase64PaddedUntypedWith f t = case BS64U.decodeBase64PaddedUntyped t of Left de -> Left $ DecodeError de Right a -> first ConversionError (f a) {-# INLINE decodeBase64PaddedUntypedWith #-} -- | Leniently decode an untyped, unpadded Base64url-encoded 'ShortText'. This function -- will not generate parse errors. If input data contains padding chars, -- then the input will be parsed up until the first pad character. -- -- __Note:__ This is not RFC 4648-compliant. -- -- === __Examples__: -- -- >>> decodeBase64Lenient "PDw_Pj4=" -- "<>" -- -- >>> decodeBase64Lenient "PDw_%%%$}Pj4" -- "<>" -- decodeBase64Lenient :: ShortText -> ShortText decodeBase64Lenient = fromText . B64TU.decodeBase64Lenient . toText {-# INLINE decodeBase64Lenient #-} -- | Tell whether an untyped 'ShortText' value is Base64url-encoded. -- -- === __Examples__: -- -- >>> isBase64Url "PDw_Pj4=" -- True -- -- >>> isBase64Url "PDw_Pj4" -- True -- -- >>> isBase64Url "PDw_Pj" -- False -- isBase64Url :: ShortText -> Bool isBase64Url = B64U.isBase64Url . toByteString {-# INLINE isBase64Url #-} -- | Tell whether an untyped 'ShortText' value is a valid Base64url format. -- -- This will not tell you whether or not this is a correct Base64url representation, -- only that it conforms to the correct shape. To check whether it is a true -- Base64 encoded 'ShortText' value, use 'isBase64Url'. -- -- === __Examples__: -- -- >>> isValidBase64Url "PDw_Pj4=" -- True -- -- >>> isValidBase64Url "PDw_Pj" -- True -- -- >>> isValidBase64Url "%" -- False -- isValidBase64Url :: ShortText -> Bool isValidBase64Url = B64U.isValidBase64Url . toByteString {-# INLINE isValidBase64Url #-} base64-1.0/test/0000755000000000000000000000000007346545000011606 5ustar0000000000000000base64-1.0/test/Internal.hs0000644000000000000000000002344007346545000013721 0ustar0000000000000000{-# LANGUAGE PackageImports #-} {-# LANGUAGE DataKinds #-} {-# LANGUAGE RankNTypes #-} {-# LANGUAGE TypeApplications #-} {-# OPTIONS_GHC -fno-warn-orphans #-} -- | -- Module : Main -- Copyright : (c) 2019-2020 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : Experimental -- Portability : portable -- -- This module contains internal test harnesses for `base64` -- module Internal ( Harness(..) , b64 , lb64 , sb64 , t64 , tl64 , ts64 , TextHarness(..) , tt64 , ttl64 , tts64 ) where import "base64" Data.Base64.Types import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as LBS import qualified Data.ByteString.Short as SBS import "base64" Data.ByteString.Base64 as B64 import "base64" Data.ByteString.Base64.URL as B64U import "base64" Data.ByteString.Lazy.Base64 as LB64 import "base64" Data.ByteString.Lazy.Base64.URL as LB64U import "base64" Data.ByteString.Short.Base64 as SB64 import "base64" Data.ByteString.Short.Base64.URL as SB64U import Data.Text (Text) import qualified Data.Text as T import "base64" Data.Text.Encoding.Base64 as T64 import "base64" Data.Text.Encoding.Base64.URL as T64U import Data.Text.Encoding.Base64.Error (Base64Error(..)) import qualified Data.Text.Lazy as TL import "base64" Data.Text.Lazy.Encoding.Base64 as TL64 import "base64" Data.Text.Lazy.Encoding.Base64.URL as TL64U import qualified Data.Text.Short as TS import "base64" Data.Text.Short.Encoding.Base64 as TS64 import "base64" Data.Text.Short.Encoding.Base64.URL as TS64U import Test.QuickCheck hiding (label) -- ------------------------------------------------------------------ -- -- Test Harnesses -- | This dictionary provides the generic API definition for -- the base64 std alphabet -- data Harness bs = Harness { label :: String , encode :: bs -> Base64 'StdPadded bs , encodeUrl :: bs -> Base64 'UrlPadded bs , encodeUrlNopad :: bs -> Base64 'UrlUnpadded bs , decode :: bs -> Either Text bs , decodeTyped :: Base64 'StdPadded bs -> bs , decodeUrl :: bs -> Either Text bs , decodeUrlPad :: bs -> Either Text bs , decodeUrlNopad :: bs -> Either Text bs , decodeUrlTyped :: forall k. UrlAlphabet k => Base64 k bs -> bs , decodeUrlTypedPad :: Base64 'UrlPadded bs -> bs , decodeUrlTypedNopad :: Base64 'UrlUnpadded bs -> bs , lenientUrl :: bs -> bs , lenient :: bs -> bs , correct :: bs -> Bool , correctUrl :: bs -> Bool , validate :: bs -> Bool , validateUrl :: bs -> Bool } b64 :: Harness BS.ByteString b64 = Harness { label = "ByteString" , encode = B64.encodeBase64' , decode = B64.decodeBase64Untyped , decodeTyped = B64.decodeBase64 , lenient = B64.decodeBase64Lenient , correct = B64.isBase64 , validate = B64.isValidBase64 , encodeUrl = B64U.encodeBase64' , encodeUrlNopad = B64U.encodeBase64Unpadded' , decodeUrl = B64U.decodeBase64Untyped , decodeUrlPad = B64U.decodeBase64PaddedUntyped , decodeUrlNopad = B64U.decodeBase64UnpaddedUntyped , decodeUrlTyped = B64U.decodeBase64 , decodeUrlTypedPad = B64U.decodeBase64Padded , decodeUrlTypedNopad = B64U.decodeBase64Unpadded , lenientUrl = B64U.decodeBase64Lenient , correctUrl = B64U.isBase64Url , validateUrl = B64U.isValidBase64Url } lb64 :: Harness LBS.ByteString lb64 = Harness { label = "Lazy ByteString" , encode = LB64.encodeBase64' , decode = LB64.decodeBase64Untyped , decodeTyped = LB64.decodeBase64 , lenient = LB64.decodeBase64Lenient , correct = LB64.isBase64 , validate = LB64.isValidBase64 , encodeUrl = LB64U.encodeBase64' , encodeUrlNopad = LB64U.encodeBase64Unpadded' , decodeUrl = LB64U.decodeBase64Untyped , decodeUrlPad = LB64U.decodeBase64PaddedUntyped , decodeUrlNopad = LB64U.decodeBase64UnpaddedUntyped , decodeUrlTyped = LB64U.decodeBase64 , decodeUrlTypedPad = LB64U.decodeBase64Padded , decodeUrlTypedNopad = LB64U.decodeBase64Unpadded , lenientUrl = LB64U.decodeBase64Lenient , correctUrl = LB64U.isBase64Url , validateUrl = LB64U.isValidBase64Url } sb64 :: Harness SBS.ShortByteString sb64 = Harness { label = "Short ByteString" , encode = SB64.encodeBase64' , decode = SB64.decodeBase64Untyped , decodeTyped = SB64.decodeBase64 , lenient = SB64.decodeBase64Lenient , correct = SB64.isBase64 , validate = SB64.isValidBase64 , encodeUrl = SB64U.encodeBase64' , encodeUrlNopad = SB64U.encodeBase64Unpadded' , decodeUrl = SB64U.decodeBase64Untyped , decodeUrlPad = SB64U.decodeBase64PaddedUntyped , decodeUrlNopad = SB64U.decodeBase64UnpaddedUntyped , decodeUrlTyped = SB64U.decodeBase64 , decodeUrlTypedPad = SB64U.decodeBase64Padded , decodeUrlTypedNopad = SB64U.decodeBase64Unpadded , lenientUrl = SB64U.decodeBase64Lenient , correctUrl = SB64U.isBase64Url , validateUrl = SB64U.isValidBase64Url } t64 :: Harness Text t64 = Harness { label = "Text" , encode = T64.encodeBase64 , decode = T64.decodeBase64Untyped , decodeTyped = T64.decodeBase64 , lenient = T64.decodeBase64Lenient , correct = T64.isBase64 , validate = T64.isValidBase64 , encodeUrl = T64U.encodeBase64 , encodeUrlNopad = T64U.encodeBase64Unpadded , decodeUrl = T64U.decodeBase64Untyped , decodeUrlPad = T64U.decodeBase64PaddedUntyped , decodeUrlNopad = T64U.decodeBase64UnpaddedUntyped , decodeUrlTyped = T64U.decodeBase64 , decodeUrlTypedPad = T64U.decodeBase64Padded , decodeUrlTypedNopad = T64U.decodeBase64Unpadded , lenientUrl = T64U.decodeBase64Lenient , correctUrl = T64U.isBase64Url , validateUrl = T64U.isValidBase64Url } tl64 :: Harness TL.Text tl64 = Harness { label = "Lazy Text" , encode = TL64.encodeBase64 , decode = TL64.decodeBase64Untyped , decodeTyped = TL64.decodeBase64 , lenient = TL64.decodeBase64Lenient , correct = TL64.isBase64 , validate = TL64.isValidBase64 , encodeUrl = TL64U.encodeBase64 , encodeUrlNopad = TL64U.encodeBase64Unpadded , decodeUrl = TL64U.decodeBase64Untyped , decodeUrlPad = TL64U.decodeBase64PaddedUntyped , decodeUrlNopad = TL64U.decodeBase64UnpaddedUntyped , decodeUrlTyped = TL64U.decodeBase64 , decodeUrlTypedPad = TL64U.decodeBase64Padded , decodeUrlTypedNopad = TL64U.decodeBase64Unpadded , lenientUrl = TL64U.decodeBase64Lenient , correctUrl = TL64U.isBase64Url , validateUrl = TL64U.isValidBase64Url } ts64 :: Harness TS.ShortText ts64 = Harness { label = "Short Text" , encode = TS64.encodeBase64 , decode = TS64.decodeBase64Untyped , decodeTyped = TS64.decodeBase64 , lenient = TS64.decodeBase64Lenient , correct = TS64.isBase64 , validate = TS64.isValidBase64 , encodeUrl = TS64U.encodeBase64 , encodeUrlNopad = TS64U.encodeBase64Unpadded , decodeUrl = TS64U.decodeBase64Untyped , decodeUrlPad = TS64U.decodeBase64PaddedUntyped , decodeUrlNopad = TS64U.decodeBase64UnpaddedUntyped , decodeUrlTyped = TS64U.decodeBase64 , decodeUrlTypedPad = TS64U.decodeBase64Padded , decodeUrlTypedNopad = TS64U.decodeBase64Unpadded , lenientUrl = TS64U.decodeBase64Lenient , correctUrl = TS64U.isBase64Url , validateUrl = TS64U.isValidBase64Url } -- -------------------------------------------------------------------- -- -- Text-specific harness data TextHarness bs cs = TextHarness { decodeWith_ :: forall err. (bs -> Either err cs) -> bs -> Either (Base64Error err) cs , decodeUrlWith_ :: forall err. (bs -> Either err cs) -> bs -> Either (Base64Error err) cs , decodeUrlPaddedWith_ :: forall err. (bs -> Either err cs) -> bs -> Either (Base64Error err) cs , decodeUrlUnpaddedWith_ :: forall err. (bs -> Either err cs) -> bs -> Either (Base64Error err) cs } tt64 :: TextHarness BS.ByteString Text tt64 = TextHarness { decodeWith_ = T64.decodeBase64UntypedWith , decodeUrlWith_ = T64U.decodeBase64UntypedWith , decodeUrlPaddedWith_ = T64U.decodeBase64PaddedUntypedWith , decodeUrlUnpaddedWith_ = T64U.decodeBase64UnpaddedUntypedWith } ttl64 :: TextHarness LBS.ByteString TL.Text ttl64 = TextHarness { decodeWith_ = TL64.decodeBase64UntypedWith , decodeUrlWith_ = TL64U.decodeBase64UntypedWith , decodeUrlPaddedWith_ = TL64U.decodeBase64PaddedUntypedWith , decodeUrlUnpaddedWith_ = TL64U.decodeBase64UnpaddedUntypedWith } tts64 :: TextHarness SBS.ShortByteString TS.ShortText tts64 = TextHarness { decodeWith_ = TS64.decodeBase64UntypedWith , decodeUrlWith_ = TS64U.decodeBase64UntypedWith , decodeUrlPaddedWith_ = TS64U.decodeBase64PaddedUntypedWith , decodeUrlUnpaddedWith_ = TS64U.decodeBase64UnpaddedUntypedWith } -- ------------------------------------------------------------------ -- -- Quickcheck instances instance Arbitrary BS.ByteString where arbitrary = BS.pack <$> arbitrary shrink xs = BS.pack <$> shrink (BS.unpack xs) instance CoArbitrary BS.ByteString where coarbitrary = coarbitrary . BS.unpack instance Arbitrary LBS.ByteString where arbitrary = LBS.pack <$> arbitrary shrink xs = LBS.pack <$> shrink (LBS.unpack xs) instance CoArbitrary LBS.ByteString where coarbitrary = coarbitrary . LBS.unpack instance Arbitrary SBS.ShortByteString where arbitrary = SBS.pack <$> arbitrary shrink xs = SBS.pack <$> shrink (SBS.unpack xs) instance CoArbitrary SBS.ShortByteString where coarbitrary = coarbitrary . SBS.unpack instance Arbitrary T.Text where arbitrary = T.pack . getUnicodeString <$> arbitrary shrink xs = T.pack . getUnicodeString <$> shrink (UnicodeString $ T.unpack xs) instance Arbitrary TL.Text where arbitrary = TL.pack . getUnicodeString <$> arbitrary shrink xs = TL.pack . getUnicodeString <$> shrink (UnicodeString $ TL.unpack xs) instance CoArbitrary T.Text where coarbitrary = coarbitrary . T.unpack instance CoArbitrary TL.Text where coarbitrary = coarbitrary . TL.unpack instance Arbitrary TS.ShortText where arbitrary = TS.fromText <$> arbitrary shrink xs = TS.fromText <$> shrink (TS.toText xs) instance CoArbitrary TS.ShortText where coarbitrary = coarbitrary . TS.toText base64-1.0/test/Main.hs0000644000000000000000000004661307346545000013040 0ustar0000000000000000{-# LANGUAGE RecordWildCards #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE PackageImports #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE TypeApplications #-} {-# LANGUAGE DataKinds #-} -- | -- Module : Main -- Copyright : (c) 2019-2020 Emily Pillmore -- License : BSD-style -- -- Maintainer : Emily Pillmore -- Stability : Experimental -- Portability : portable -- -- This module contains the test implementation for the `base64` package -- module Main ( main ) where import Prelude hiding (length) import Data.Base64.Types import Data.Bifunctor (second) import qualified Data.ByteString as BS import Data.ByteString.Internal (c2w) import qualified Data.ByteString.Lazy as LBS import qualified Data.ByteString.Short as SBS import "base64" Data.ByteString.Base64 as B64 import "base64" Data.ByteString.Base64.URL as B64U import qualified "base64-bytestring" Data.ByteString.Base64 as Bos import qualified "base64-bytestring" Data.ByteString.Base64.URL as BosU import qualified Data.Text as T import qualified Data.Text.Encoding as T import Data.Text.Encoding.Base64.Error (Base64Error(..)) import qualified Data.Text.Lazy as TL import qualified Data.Text.Lazy.Encoding as TL import qualified Data.Text.Short as TS import Data.Word import Internal import Test.Tasty import Test.Tasty.HUnit import Test.Tasty.QuickCheck (testProperty) import Data.String (IsString) import Test.QuickCheck hiding (label) main :: IO () main = defaultMain tests tests :: TestTree tests = testGroup "Base64 Tests" [ mkTree b64 [ mkPropTree , mkUnitTree BS.last BS.length ] , mkTree lb64 [ mkPropTree , mkUnitTree LBS.last (fromIntegral . LBS.length) ] , mkTree sb64 [ mkPropTree , mkUnitTree (BS.last . SBS.fromShort) SBS.length ] , mkTree t64 [ mkPropTree , mkUnitTree (c2w . T.last) T.length , mkDecodeTree T.decodeUtf8' tt64 b64 ] , mkTree tl64 [ mkPropTree , mkUnitTree (c2w . TL.last) (fromIntegral . TL.length) , mkDecodeTree TL.decodeUtf8' ttl64 lb64 ] , mkTree ts64 [ mkPropTree , mkUnitTree (c2w . T.last . TS.toText) TS.length , mkDecodeTree (second TS.fromText . T.decodeUtf8' . SBS.fromShort) tts64 sb64 ] ] -- ---------------------------------------------------------------- -- -- Test tree generation -- | Make a test tree for a given label -- mkTree :: ( Arbitrary a , IsString a , Eq a , Show a ) => Harness a -> [Harness a -> TestTree] -> TestTree mkTree a = testGroup (label a) . fmap ($ a) -- | Make a test group with some name, lifting a test tree up to the correct -- type information via some Harness -- mkTests :: ( Arbitrary a , IsString a , Eq a , Show a ) => String -> [Harness a -> TestTree] -> Harness a -> TestTree mkTests context ts = testGroup context . (<*>) ts . pure -- | Make property tests for a given harness instance -- mkPropTree :: (Arbitrary a, IsString a, Eq a, Show a) => Harness a -> TestTree mkPropTree = mkTests "Property Tests" [ prop_roundtrip , prop_correctness , prop_url_padding , const prop_bos_coherence ] -- | Make unit tests for a given harness instance -- mkUnitTree :: (Arbitrary a, IsString a, Eq a, Show a) => (a -> Word8) -> (a -> Int) -> Harness a -> TestTree mkUnitTree last_ length_ = mkTests "Unit tests" [ paddingTests last_ length_ , rfcVectors , offsetVectors , validityTests , canonicityTests ] -- | Make unit tests for textual 'decode*With' functions -- mkDecodeTree :: ( Arbitrary t , Eq t , IsString t , Show t , IsString a , Show e ) => (a -> Either e t) -> TextHarness a t -> Harness a -> Harness t -> TestTree mkDecodeTree utf8 t a = mkTests "Decoding tests" [ decodeWithVectors utf8 t a ] -- ---------------------------------------------------------------- -- -- Property tests prop_roundtrip :: (Arbitrary a, IsString a, Eq a, Show a) => Harness a -> TestTree prop_roundtrip Harness{..} = testGroup "prop_roundtrip" [ testProperty "prop_std_roundtrip" $ \(bs :: a) -> Right bs == decode (extractBase64 $ encode bs) , testProperty "prop_url_roundtrip" $ \(bs :: a) -> Right bs == decodeUrl (extractBase64 $ encodeUrl bs) , testProperty "prop_url_roundtrip_nopad" $ \(bs :: a) -> Right bs == decodeUrlNopad (extractBase64 $ encodeUrlNopad bs) , testProperty "prop_std_lenient_roundtrip" $ \(bs :: a) -> bs == lenient (extractBase64 $ encode bs) , testProperty "prop_url_lenient_roundtrip" $ \(bs :: a) -> bs == lenientUrl (extractBase64 $ encodeUrl bs) ] prop_correctness :: (Arbitrary a, IsString a, Eq a, Show a) => Harness a -> TestTree prop_correctness Harness{..} = testGroup "prop_validity" [ testProperty "prop_std_valid" $ \(bs :: a) -> validate (extractBase64 $ encode bs) , testProperty "prop_url_valid" $ \(bs :: a) -> validateUrl (extractBase64 $ encodeUrl bs) , testProperty "prop_std_correct" $ \(bs :: a) -> correct (extractBase64 $ encode bs) , testProperty "prop_url_correct" $ \(bs :: a) -> correctUrl (extractBase64 $ encodeUrl bs) ] prop_url_padding :: (Arbitrary a, IsString a, Eq a, Show a) => Harness a -> TestTree prop_url_padding Harness{..} = testGroup "prop_url_padding" [ testProperty "prop_url_nopad_roundtrip_untyped" $ \(bs :: a) -> Right bs == decodeUrlNopad (extractBase64 $ encodeUrlNopad bs) , testProperty "prop_url_nopad_roundtrip_typed" $ \(bs :: a) -> bs == decodeUrlTypedNopad (encodeUrlNopad bs) , testProperty "prop_url_pad_roundtrip_untyped" $ \(bs :: a) -> Right bs == decodeUrlPad (extractBase64 $ encodeUrl bs) , testProperty "prop_url_pad_roundtrip_typed" $ \(bs :: a) -> bs == decodeUrlTypedPad (encodeUrl bs) , testProperty "prop_url_decode_invariant_untyped" $ \(bs :: a) -> (decodeUrlNopad (extractBase64 $ encodeUrlNopad bs) == decodeUrl (extractBase64 $ encodeUrlNopad bs)) || (decodeUrlPad (extractBase64 $ encodeUrl bs) == decodeUrl (extractBase64 $ encodeUrl bs)) , testProperty "prop_url_decode_invariant_typed" $ \(bs :: a) -> (decodeUrlTypedNopad (encodeUrlNopad bs) == decodeUrlTyped (encodeUrlNopad bs)) || (decodeUrlPad (extractBase64 $ encodeUrl bs) == decodeUrl (extractBase64 $ encodeUrl bs)) -- NOTE: we need to fix the bitmasking issue for "impossible" -- inputs , testProperty "prop_url_padding_coherence" $ \(bs :: a) -> Right bs == decodeUrl (extractBase64 $ encodeUrl bs) && Right bs == decodeUrlPad (extractBase64 $ encodeUrl bs) , testProperty "prop_url_nopadding_coherence" $ \(bs :: a) -> Right bs == decodeUrlNopad (extractBase64 $ encodeUrlNopad bs) && Right bs == decodeUrl (extractBase64 $ encodeUrlNopad bs) ] -- | just a sanity check against `base64-bytestring` -- prop_bos_coherence :: TestTree prop_bos_coherence = testGroup "prop_bos_coherence" [ testProperty "prop_std_bos_coherence" $ \bs -> Right bs == B64.decodeBase64Untyped (extractBase64 $ B64.encodeBase64' bs) && Right bs == Bos.decode (Bos.encode bs) && bs == B64.decodeBase64 (B64.encodeBase64' bs) , testProperty "prop_url_bos_coherence" $ \bs -> Right bs == B64U.decodeBase64Untyped (extractBase64 $ B64U.encodeBase64' bs) && Right bs == BosU.decode (BosU.encode bs) && bs == B64U.decodeBase64 (B64U.encodeBase64' bs) ] -- ---------------------------------------------------------------- -- -- Unit tests -- | RFC 4648 test vectors -- rfcVectors :: (IsString a, Eq a, Show a) => Harness a -> TestTree rfcVectors Harness{..} = testGroup "RFC 4648 Test Vectors" [ testGroup "std alphabet" [ testCaseStd "" "" , testCaseStd "f" "Zg==" , testCaseStd "f" "Zg==" , testCaseStd "fo" "Zm8=" , testCaseStd "foo" "Zm9v" , testCaseStd "foob" "Zm9vYg==" , testCaseStd "fooba" "Zm9vYmE=" , testCaseStd "foobar" "Zm9vYmFy" ] , testGroup "url-safe alphabet" [ testCaseUrl "" "" , testCaseUrl "<" "PA==" , testCaseUrl "<<" "PDw=" , testCaseUrl "<" "PDw_Pz4=" , testCaseUrl "<>" "PDw_Pz4-" ] ] where testCaseStd s t = testCaseSteps (show $ if s == "" then "empty" else s) $ \step -> do step "encode is sound" assertBase64 t @=? encode s step "decode is sound - untyped" Right s @=? decode (extractBase64 $ encode s) Right s @=? decode t step "decode is sound - typed" s @=? decodeTyped (encode s) s @=? decodeTyped (assertBase64 t) testCaseUrl s t = testCaseSteps (show $ if s == "" then "empty" else s) $ \step -> do step "encode is sound" assertBase64 t @=? encodeUrl s step "decode is sound - untyped" Right s @=? decodeUrl (extractBase64 $ encodeUrl s) Right s @=? decodeUrl t step "decode is sound - typed" s @=? decodeUrlTyped (encodeUrl s) s @=? decodeUrlTypedPad (encodeUrl s) s @=? decodeUrlTyped (assertBase64 @'UrlPadded t) s @=? decodeUrlTypedPad (assertBase64 @'UrlPadded t) -- | Url-safe padding unit tests (stresses entire alphabet) -- paddingTests :: ( IsString a , Eq a , Show a ) => (a -> Word8) -> (a -> Int) -> Harness a -> TestTree paddingTests last_ length_ Harness{..} = testGroup "Padding tests" [ testGroup "URL decodePadding coherence" [ ptest "<" "PA==" , ptest "<<" "PDw=" , ptest "<" "PDw_Pz4=" , ptest "<>" "PDw_Pz4-" ] , testGroup "URL decodeUnpadded coherence" [ utest "<" "PA" , utest "<<" "PDw" , utest "<" "PDw_Pz4" , utest "<>" "PDw_Pz4-" ] , testGroup "url-safe padding case unit tests" [ testCase "stress arbitarily padded URL strings" $ do decodeUrl "P" @=? Left "Base64-encoded bytestring has invalid size" decodeUrl "PA" @=? Right "<" decodeUrl "PDw" @=? Right "<<" decodeUrl "PDw_" @=? Right "< do let u = decodeUrlNopad t v = decodeUrlPad t if last_ t == 0x3d then do step "Padding required: no padding fails" u @=? Left "Base64-encoded bytestring has invalid padding" step "Padding required: padding succeeds" v @=? Right s else do step "String has no padding: decodes should coincide" u @=? Right s v @=? Right s v @=? u utest s t = testCaseSteps (show $ if t == "" then "empty" else t) $ \step -> do let u = decodeUrlPad t v = decodeUrlNopad t if length_ t `mod` 4 == 0 then do step "String has no padding: decodes should coincide" u @=? Right s v @=? Right s v @=? u else do step "Unpadded required: padding fails" u @=? Left "Base64-encoded bytestring requires padding" step "Unpadded required: unpadding succeeds" v @=? Right s -- | Offset test vectors. This stresses the invalid char + incorrect padding -- offset error messages -- offsetVectors :: (IsString a, Eq a, Show a) => Harness a -> TestTree offsetVectors Harness{..} = testGroup "Offset tests" [ testGroup "Invalid padding" [ testCase "Invalid staggered padding" $ do decodeUrl "=A==" @=? Left "invalid padding near offset: 0" decodeUrl "P===" @=? Left "invalid padding near offset: 1" , testCase "Invalid character coverage - final chunk" $ do decodeUrl "%D==" @=? Left "invalid base64 encoding near offset: 0" decodeUrl "P%==" @=? Left "invalid base64 encoding near offset: 1" decodeUrl "PD%=" @=? Left "invalid base64 encoding near offset: 2" decodeUrl "PA=%" @=? Left "invalid base64 encoding near offset: 3" decodeUrl "PDw%" @=? Left "invalid base64 encoding near offset: 3" , testCase "Invalid character coverage - decode chunk" $ do decodeUrl "%Dw_PDw_" @=? Left "invalid base64 encoding near offset: 0" decodeUrl "P%w_PDw_" @=? Left "invalid base64 encoding near offset: 1" decodeUrl "PD%_PDw_" @=? Left "invalid base64 encoding near offset: 2" decodeUrl "PDw%PDw_" @=? Left "invalid base64 encoding near offset: 3" , testCase "Invalid padding in body" $ do decodeUrl "PD=_PDw_" @=? Left "invalid padding near offset: 2" decodeUrl "PDw=PDw_" @=? Left "invalid padding near offset: 3" , testCase "Padding fails everywhere but end" $ do decode "=eAoeAo=" @=? Left "invalid padding near offset: 0" decode "e=AoeAo=" @=? Left "invalid padding near offset: 1" decode "eA=oeAo=" @=? Left "invalid padding near offset: 2" decode "eAo=eAo=" @=? Left "invalid padding near offset: 3" decode "eAoe=Ao=" @=? Left "invalid padding near offset: 4" decode "eAoeA=o=" @=? Left "invalid padding near offset: 5" ] ] canonicityTests :: (IsString a, Eq a, Show a) => Harness a -> TestTree canonicityTests Harness{..} = testGroup "Canonicity unit tests" [ testCase "roundtrip for d ~ ZA==" $ do decode "ZE==" @=? Left "non-canonical encoding detected at offset: 1" decode "ZK==" @=? Left "non-canonical encoding detected at offset: 1" decode "ZA==" @=? Right "d" , testCase "roundtrip for f` ~ ZmA=" $ do decode "ZmC=" @=? Left "non-canonical encoding detected at offset: 2" decode "ZmD=" @=? Left "non-canonical encoding detected at offset: 2" decode "ZmA=" @=? Right "f`" , testCase "roundtrip for foo` ~ Zm9vYA==" $ do decode "Zm9vYE==" @=? Left "non-canonical encoding detected at offset: 5" decode "Zm9vYK==" @=? Left "non-canonical encoding detected at offset: 5" decode "Zm9vYA==" @=? Right "foo`" , testCase "roundtrip for foob` ~ Zm9vYmA=" $ do decode "Zm9vYmC=" @=? Left "non-canonical encoding detected at offset: 6" decode "Zm9vYmD=" @=? Left "non-canonical encoding detected at offset: 6" decode "Zm9vYmA=" @=? Right "foob`" ] -- | Unit test trees for the `decode*With` family of text-valued functions -- decodeWithVectors :: ( IsString a , IsString t , Eq t , Show e , Show t ) => (a -> Either e t) -- ^ utf8 -> TextHarness a t -- ^ witness to the bytestring-ey dictionaries -> Harness a -- ^ witness to the text dictionaries -> Harness t -> TestTree decodeWithVectors utf8 TextHarness{..} h t = testGroup "DecodeWith* unit tests" [ testGroup "decodeWith negative tests" [ testCase "decodeWith non-utf8 inputs on decodeUtf8" $ do case decodeWith_ utf8 "\1079743" of Left (DecodeError _) -> return () _ -> assertFailure "decoding phase" , testCase "decodeWith valid utf8 inputs on decodeUtf8" $ do case decodeWith_ utf8 (extractBase64 $ encode h "\1079743") of Left (ConversionError _) -> return () _ -> assertFailure "conversion phase" , testCase "decodeUrlWith non-utf8 inputs on decodeUtf8" $ do case decodeUrlWith_ utf8 "\1079743" of Left (DecodeError _) -> return () _ -> assertFailure "decoding phase" , testCase "decodeUrlWith valid utf8 inputs on decodeUtf8" $ do case decodeUrlWith_ utf8 (extractBase64 $ encodeUrl h "\1079743") of Left (ConversionError _) -> return () _ -> assertFailure "conversion phase" , testCase "decodeUrlPaddedWith non-utf8 inputs on decodeUtf8" $ do case decodeUrlPaddedWith_ utf8 "\1079743" of Left (DecodeError _) -> return () _ -> assertFailure "decoding phase" , testCase "decodePaddedWith valid utf8 inputs on decodeUtf8" $ do case decodeUrlPaddedWith_ utf8 (extractBase64 $ encodeUrl h "\1079743") of Left (ConversionError _) -> return () _ -> assertFailure "conversion phase" , testCase "decodeUnpaddedWith non-utf8 inputs on decodeUtf8" $ do case decodeUrlUnpaddedWith_ utf8 "\1079743" of Left (DecodeError _) -> return () _ -> assertFailure "decoding phase" , testCase "decodeUnpaddedWith valid utf8 inputs on decodeUtf8" $ do case decodeUrlUnpaddedWith_ utf8 (extractBase64 $ encodeUrlNopad h "\1079743") of Left (ConversionError _) -> return () _ -> assertFailure "conversion phase" ] , testGroup "decodeWith positive tests" [ testCase "decodeWith utf8 inputs on decodeUtf8" $ do a <- either (assertFailure . show) pure $ decode t "Zm9vYmFy" b <- either (assertFailure . show) pure $ decodeWith_ utf8 "Zm9vYmFy" a @=? b , testCase "decodeUrlWith utf8 inputs on decodeUtf8" $ do a <- either (assertFailure . show) pure $ decodeUrl t "PDw_Pz4-" b <- either (assertFailure . show) pure $ decodeUrlWith_ utf8 "PDw_Pz4-" a @=? b , testCase "decodeUrlPaddedWith utf8 inputs on decodeUtf8" $ do a <- either (assertFailure . show) pure $ decodeUrlPad t "PDw_Pz4-" b <- either (assertFailure . show) pure $ decodeUrlPaddedWith_ utf8 "PDw_Pz4-" a @=? b , testCase "decodeUrlUnpaddedWith utf8 inputs on decodeUtf8" $ do a <- either (assertFailure . show) pure $ decodeUrlNopad t "PDw_Pz4-" b <- either (assertFailure . show) pure $ decodeUrlUnpaddedWith_ utf8 "PDw_Pz4-" a @=? b ] ] -- | Validity unit tests for the URL workflow -- validityTests :: IsString a => Harness a -> TestTree validityTests Harness{..} = testGroup "Validity and correctness unit tests" [ testGroup "Validity unit tests" [ testCase "Padding tests" $ do not (validateUrl "P") @? "P" validateUrl "PA" @? "PA" validateUrl "PDw" @? "PDw" validateUrl "PDw_" @? "PDw_" validateUrl "PA==" @? "PA==" validateUrl "PDw=" @? "PDw=" validateUrl "PDw_" @? "PDw_" , testCase "Canonicity tests" $ do validateUrl "ZK==" @? "ZK==" validateUrl "ZE==" @? "ZE==" validateUrl "ZA==" @? "ZA==" validateUrl "ZK==" @? "ZK==" validateUrl "ZK" @? "ZK" validateUrl "ZmA=" @? "ZmA=" validateUrl "ZmC=" @? "ZmC=" validateUrl "ZmE" @? "ZmE" validateUrl "Zm9vYmA=" @? "Zm9vYmA=" validateUrl "Zm9vYmC=" @? "Zm9vYmC=" validateUrl "Zm9vYmC" @? "Zm9vYmC" ] , testGroup "Correctness unit tests" [ testCase "Padding tests" $ do not (validateUrl "P") @? "P" correctUrl "PA" @? "PA" correctUrl "PDw" @? "PDw" correctUrl "PDw_" @? "PDw_" correctUrl "PA==" @? "PA==" correctUrl "PDw=" @? "PDw=" correctUrl "PDw_" @? "PDw_" , testCase "Canonicity tests" $ do not (correctUrl "ZK==") @? "ZK==" not (correctUrl "ZE==") @? "ZE==" correctUrl "ZA==" @? "ZA==" correctUrl "ZmA=" @? "ZmA=" not (correctUrl "ZmC=") @? "ZmC=" not (correctUrl "ZmD") @? "ZmD" correctUrl "Zm9vYmA=" @? "Zm9vYmA=" not (correctUrl "Zm9vYmC=") @? "Zm9vYmC=" not (correctUrl "Zm9vYmC") @? "Zm9vYmC" ] ]