deser-hjson-2.2.4/.cargo_vcs_info.json
{
"git": {
"sha1": "ad8b02c186f18438c59d6fab8482a229b0f0d88d"
},
"path_in_vcs": ""
}
deser-hjson-2.2.4/.gitignore
.bacon-locations
/target
Cargo.lock
glassbench_*.db
deser-hjson-2.2.4/CHANGELOG.md
### v2.2.4 - 2023-11-28
- fix wrong handling of some multiline strings - Fix #19
### v2.2.3 - 2023-11-19
- fix a case of non understood hjson (regression introduced in 2.2.1) - Fix #20
### v2.2.2 - 2023-11-04
- fix non optional boolean value not parsed after a space in a struct - Fix #18
### v2.2.1 - 2023-10-27
- performance improvements
### v2.2.0 - 2023-09-09
- Allow single-quoted identifiers and enum values - thanks @jwnrt
### v2.1.0 - 2023-07-09
- discard trailing whitespaces in quoteless strings
### v2.0.0 - 2023-07-09
- `from_reader` function
- Error type no longer `Clone` and `PartialEq`, flagged `non_exhaustive`
### v1.2.0 - 2023-05-25
- `from_slice` function
### v1.1.1 - 2023-04-22
- accept quotes in "quoteless" keys - Fix #9
### v1.1.0 - 2022-12-21
- support for braceless Hjson - Fix #7
### v1.0.2 - 2021-07-31
- fix tab after quoteless map key being read as part of the key
### v1.0.1 - 2021-06-22
- properly parse single quote strings
- fix type guessing in some cases for null, false, and true
### v1.0.0 - 2021-06-15
- it's stable. Calling it a 1.0
### v0.1.13 - 2021-05-26
- make \r\n behave like \n
- allow more liberty for enum variants
### v0.1.12 - 2021-02-13
- more precise number type guessing
### v0.1.11 - 2021-02-11
- fix primitive types (ie not Hjson texts but primitives like integers and floats) needing a space at the end - Fix #1
### v0.1.10 - 2021-02-11
- make from_str parse a `DeserializeOwned` instead of a borrowed `Deserialize<'a>`
deser-hjson-2.2.4/Cargo.toml
# THIS FILE IS AUTOMATICALLY GENERATED BY CARGO
#
# When uploading crates to the registry Cargo will automatically
# "normalize" Cargo.toml files for maximal compatibility
# with all versions of Cargo and also rewrite `path` dependencies
# to registry (e.g., crates.io) dependencies.
#
# If you are reading this file be aware that the original Cargo.toml
# will likely look very different (and much more reasonable).
# See Cargo.toml.orig for the original contents.
[package]
edition = "2018"
name = "deser-hjson"
version = "2.2.4"
authors = ["dystroy "]
description = "a Hjson deserializer for Serde"
readme = "README.md"
keywords = [
"hjson",
"deserialization",
"serde",
"derive",
"json",
]
categories = ["encoding"]
license = "MIT"
repository = "https://github.com/Canop/deser-hjson"
[profile.bench]
lto = true
[profile.release]
lto = true
[[bench]]
name = "parse"
harness = false
[dependencies.serde]
version = "1.0"
features = ["derive"]
[dev-dependencies.glassbench]
version = "0.3.5"
deser-hjson-2.2.4/Cargo.toml.orig
[package]
name = "deser-hjson"
version = "2.2.4"
authors = ["dystroy "]
repository = "https://github.com/Canop/deser-hjson"
description = "a Hjson deserializer for Serde"
edition = "2018"
keywords = ["hjson", "deserialization", "serde", "derive", "json"]
license = "MIT"
categories = ["encoding"]
readme = "README.md"
[dependencies]
serde = { version = "1.0", features = ["derive"] }
[dev-dependencies]
glassbench = "0.3.5"
[[bench]]
name = "parse"
harness = false
[profile.bench]
lto = true
[profile.release]
lto = true
deser-hjson-2.2.4/LICENSE
MIT License
Copyright (c) 2020 Canop
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
deser-hjson-2.2.4/README.md
[![MIT][s2]][l2] [![Latest Version][s1]][l1] [![docs][s3]][l3] [![Chat on Miaou][s4]][l4]
[s1]: https://img.shields.io/crates/v/deser-hjson.svg
[l1]: https://crates.io/crates/deser-hjson
[s2]: https://img.shields.io/badge/license-MIT-blue.svg
[l2]: LICENSE
[s3]: https://docs.rs/deser-hjson/badge.svg
[l3]: https://docs.rs/deser-hjson/
[s4]: https://miaou.dystroy.org/static/shields/room.svg
[l4]: https://miaou.dystroy.org/3768
# deser_hjson
This is a Serde deserializer for [Hjson](https://hjson.github.io/), tailored for derive-powered deserialization.
Hjson is a good language for a configuration file.
Such files should be written by a human, read and modified by other humans, then deserialized into a precise structure by a program:
```rust
let file_content = fs::read_to_string(&file_path)?;
let configuration = deser_hjson::from_str(&file_content);
```
If the configuration file is invalid or doesn't match the expected type, the error describes what was expected and gives the precise location (line and column) of the problem.
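To illustrate what a location in the source means, here is a minimal standalone sketch (standard library only) that turns a byte offset into a 1-based line and column, in the spirit of the `location` method in `src/de.rs`. The function name `line_col` is an assumption made for this illustration and is not part of the crate's public API.

```rust
/// Compute the 1-based (line, column) of a byte offset in `src`.
/// `offset` must fall on a char boundary, otherwise the slicing panics.
fn line_col(src: &str, offset: usize) -> (usize, usize) {
    let (mut line, mut col) = (1, 1);
    for ch in src[..offset].chars() {
        if ch == '\n' {
            // a new line starts: reset the column
            line += 1;
            col = 1;
        } else {
            // columns are counted in chars, not in bytes
            col += 1;
        }
    }
    (line, col)
}
```

For instance, `line_col("key: 1\nbad", 7)` points at the first character of the second line.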
## Example
```rust
use {
deser_hjson::*,
serde::Deserialize,
std::collections::HashMap,
};
// This Hjson document comes from https://hjson.github.io/
let hjson = r#"
// use #, // or /**/ for comments,
// omit quotes for keys
key: 1
// omit quotes for strings
contains: everything on this line
// omit commas at the end of a line
cool: {
foo: 1
bar: 2
}
// allow trailing commas
list: [
1,
2,
]
// and use multiline strings
realist:
'''
My half empty glass,
I will fill your empty half.
Now you are half full.
'''
"#;
// we'll deserialize it into this struct:
#[derive(Deserialize, PartialEq, Debug)]
struct Example {
key: i32,
contains: Option<String>,
cool: HashMap<String, u16>,
list: Vec<usize>,
realist: String,
missing: Option<f64>,
}
let mut cool = HashMap::new();
cool.insert("foo".to_owned(), 1);
cool.insert("bar".to_owned(), 2);
let expected = Example {
key: 1,
contains: Some("everything on this line".to_owned()),
cool,
list: vec![1, 2],
realist: "My half empty glass,\nI will fill your empty half.\nNow you are half full.".to_owned(),
missing: None,
};
// Here's the deserialization and the equality check:
assert_eq!(expected, from_str(hjson).unwrap());
```
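The `contains` field above relies on quoteless strings: the value runs to the end of the line, and trailing whitespace is discarded (see the v2.1.0 changelog entry). The following is a simplified standalone sketch of that rule, loosely modeled on `parse_quoteless_str` in `src/de.rs`; the helper name `quoteless_value` is hypothetical, and the sketch ignores everything else the real parser handles.

```rust
/// Read a quoteless value: everything up to the end of the line,
/// with trailing whitespace discarded.
fn quoteless_value(input: &str) -> &str {
    // the value stops at the first carriage return or line feed, if any
    let end = input
        .find(|c: char| c == '\r' || c == '\n')
        .unwrap_or(input.len());
    input[..end].trim_end()
}
```

Note that only trailing whitespace is removed: spaces inside the value are kept as they are.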
## Known open-source usages
* [Broot](https://dystroy.org/broot) can be configured either with TOML or with Hjson (the selection is dynamic, based on the file extension).
* [lemmy](https://github.com/LemmyNet/lemmy) is configured in Hjson
* [Resc](https://github.com/Canop/resc) can be configured either with JSON or with Hjson
## FAQ
### Does it work with JSON?
Yes, as any JSON file can be read as Hjson.
### Why only a derive-based deserializer?
Guessing the types in a format with implicit typing is way too dangerous.
When your user typed `false`, was it a string or a boolean? When she typed `3`, was it a string or a number?
While [not as crazy as YAML](https://hitchdev.com/strictyaml/why/implicit-typing-removed/), Hjson has no internal guard for this, and thus should only be deserialized into explicit types.
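The ambiguity can be made concrete with a small standalone sketch (standard library only). The `Guessed` enum and the `guess` and `read_as_text` functions are hypothetical names used only for this illustration; they are not part of this crate.

```rust
/// What a type-guessing reader could come up with for a bare token.
#[derive(Debug, PartialEq)]
enum Guessed {
    Bool(bool),
    Int(i64),
    Text(String),
}

/// A guessing reader has to pick one interpretation,
/// whatever the author of the file had in mind.
fn guess(token: &str) -> Guessed {
    if let Ok(b) = token.parse::<bool>() {
        Guessed::Bool(b)
    } else if let Ok(i) = token.parse::<i64>() {
        Guessed::Int(i)
    } else {
        Guessed::Text(token.to_owned())
    }
}

/// With an explicit target type there is nothing to guess:
/// the caller states that the token is text.
fn read_as_text(token: &str) -> String {
    token.to_owned()
}
```

Here `guess("false")` yields a boolean even if the user meant the word "false", while `read_as_text("false")` keeps it as text because the target type says so.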
### Why a deserializer and no serializer?
Hjson isn't a data exchange format. It's intended to be written by humans, to be full of comments, and to have meaningful formatting.
While serializers would make sense in some contexts, they would have to be template based, or offer other means to specify comments and formatting, and serde isn't the right tool for that.
deser-hjson-2.2.4/bacon.toml
# This is a configuration file for the bacon tool
#
# Bacon repository: https://github.com/Canop/bacon
# Complete help on configuration: https://dystroy.org/bacon/config/
default_job = "check"
[jobs.check]
command = ["cargo", "check", "--color", "always"]
need_stdout = false
[jobs.check-all]
command = ["cargo", "check", "--all-targets", "--color", "always"]
need_stdout = false
[jobs.clippy]
command = [
"cargo", "clippy",
"--color", "always",
"--",
"-A", "clippy::vec_init_then_push",
]
need_stdout = false
[jobs.test]
command = [
"cargo", "test", "--color", "always",
"--", "--color", "always", # see https://github.com/Canop/bacon/issues/124
]
need_stdout = true
[jobs.doc]
command = ["cargo", "doc", "--color", "always", "--no-deps"]
need_stdout = false
# If the doc compiles, then it opens in your browser and bacon switches
# to the previous job
[jobs.doc-open]
command = ["cargo", "doc", "--color", "always", "--no-deps", "--open"]
need_stdout = false
on_success = "back" # so that we don't open the browser at each change
# You can run your application and have the result displayed in bacon,
# *if* it makes sense for this crate. You can run an example the same
# way. Don't forget the `--color always` part or the errors won't be
# properly parsed.
# If you want to pass options to your program, a `--` separator
# will be needed.
[jobs.run]
command = [ "cargo", "run", "--color", "always" ]
need_stdout = true
allow_warnings = true
# You may define here keybindings that would be specific to
# a project, for example a shortcut to launch a specific job.
# Shortcuts to internal functions (scrolling, toggling, etc.)
# should go in your personal global prefs.toml file instead.
[keybindings]
# alt-m = "job:my-job"
deser-hjson-2.2.4/benches/parse.rs
use {
deser_hjson::from_str,
serde::Deserialize,
glassbench::*,
};
static GIFTS: &[&str] = &[
"{gift:null}",
"{gift:false}",
"{gift: true}",
"{gift:'bar'}",
r#"{gift:"bar"}"#,
"{gift:42}",
"{gift:42457811247}",
"{gift:-42}",
r#"{gift: "abcㅈ"}"#,
"{gift:[15, -50]}",
"{gift:[\"abc\"]}",
r#"{gift:["abc", "another string"]}"#,
r#" {
gift: [
"abc",
"another string"
and a third one (unquoted)
]
}"#,
"{gift:''}",
];
#[derive(Deserialize, PartialEq, Debug)]
#[serde(untagged)]
enum Guess {
Bool(bool),
U8(u8),
I8(i8),
U16(u16),
I16(i16),
U32(u32),
I32(i32),
U64(u64),
I64(i64),
F64(f64),
Char(char),
String(Option<String>),
U16Array(Vec<u16>),
I16Array(Vec<i16>),
StrArray(Vec<String>),
}
#[derive(Deserialize, PartialEq, Debug)]
struct WrappedGuess {
gift: Guess,
}
fn bench_parse(bench: &mut Bench) {
bench.task("guess wrapped", |task| {
task.iter(|| {
for hjson in GIFTS {
let guessed = from_str::<WrappedGuess>(hjson)
.unwrap_or_else(|e| panic!("Parsing failed for {:?} : {}", hjson, e));
pretend_used(guessed);
}
});
});
}
glassbench!(
"Parse",
bench_parse,
);
deser-hjson-2.2.4/src/de.rs
//! A Hjson deserializer.
//!
use {
crate::{
de_enum::*,
de_map::*,
de_number::*,
de_seq::*,
error::{
Error,
ErrorCode::{self, *},
Result,
},
utf8::*,
},
serde::de::{self, IntoDeserializer, Visitor},
};
/// The deserializer. You normally don't call it directly
/// but use the `from_str` function available at crate's level.
pub struct Deserializer<'de> {
// the complete string we received
src: &'de str,
// where we're at, in bytes
pos: usize,
// Makes it possible to avoid reading a string as a quoteless
// string when a map key is expected (for example in
// {
// key: value
// }
// ) so that the key doesn't extend to the end of the line.
pub(crate) accept_quoteless_value: bool,
}
impl<'de> Deserializer<'de> {
pub fn from_str(src: &'de str) -> Self {
Deserializer {
src,
pos: 0,
accept_quoteless_value: true,
}
}
/// Compute the number of lines and columns to current pos.
/// First line and first col are of index 1.
#[cold]
fn location(&self) -> (usize, usize) {
let (mut line, mut col) = (1, 1);
for ch in self.src[..self.pos].chars() {
if ch == '\n' {
col = 1;
line += 1;
} else {
col += 1;
}
}
(line, col)
}
fn col(&self) -> usize {
let mut p = self.pos;
loop {
if p == 0 {
break;
}
let b = self.src.as_bytes()[p];
if b == b'\r' || b == b'\n' {
break;
}
p -= 1;
}
self.pos - p
}
/// build a syntax error
#[cold]
pub(crate) fn err(&self, code: ErrorCode) -> Error {
let (line, col) = self.location();
// we'll show the next 15 chars in the error message
let at = self.input().chars().take(15).collect();
Error::Syntax {
line,
col,
code,
at,
}
}
/// convert a serde raised error into one with precise location
#[cold]
pub(crate) fn cook_err<T>(&self, err: Error) -> Result<T> {
match err {
Error::RawSerde(message) => {
let (line, col) = self.location();
// we have no real idea where Serde found the problem
// so we write the position but not the characters around
Err(Error::Serde {
line,
col,
message,
})
}
e => Err(e),
}
}
#[cold]
pub(crate) fn fail<T>(&self, code: ErrorCode) -> Result<T> {
Err(self.err(code))
}
/// return an error if there's more than just spaces
/// and comments in the remaining input
pub fn check_all_consumed(&mut self) -> Result<()> {
self.eat_shit().ok();
if self.input().is_empty() {
Ok(())
} else {
self.fail(TrailingCharacters)
}
}
/// what remains to be parsed (including the
/// character we peeked at, if any)
#[inline(always)]
pub(crate) fn input(&self) -> &'de str {
&self.src[self.pos..]
}
/// takes all remaining characters
#[inline(always)]
pub(crate) fn take_all(&mut self) -> &'de str {
let s = &self.src[self.pos..];
self.pos = self.src.len();
s
}
/// return the next code point and its byte size, without advancing the cursor
// adapted from https://doc.rust-lang.org/src/core/str/validations.rs.html
#[inline]
fn peek_code_point(&self) -> Result<(u32, usize)> {
let bytes = self.src.as_bytes();
if self.pos >= bytes.len() {
return self.fail(Eof);
}
// As we start from an already verified UTF-8 str, and a valid position,
// we can safely assume the bytes here are consistent with a UTF-8 string
let x = bytes[self.pos];
if x < 128 {
return Ok(((x as u32), 1));
}
// Decode from a byte combination out of: [[[x y] z] w]
let init = utf8_first_byte(x, 2);
// SAFETY bytes assumed valid utf8
let y = unsafe { *bytes.get_unchecked(self.pos+1) };
let mut ch = utf8_acc_cont_byte(init, y);
if x >= 0xE0 {
// [[x y z] w] case
// 5th bit in 0xE0 .. 0xEF is always clear, so `init` is still valid
let z = unsafe { *bytes.get_unchecked(self.pos+2) };
let y_z = utf8_acc_cont_byte((y & CONT_MASK) as u32, z);
ch = init << 12 | y_z;
if x >= 0xF0 {
// [x y z w] case
// use only the lower 3 bits of `init`
let w = unsafe { *bytes.get_unchecked(self.pos+3) };
ch = (init & 7) << 18 | utf8_acc_cont_byte(y_z, w);
Ok((ch, 4))
} else {
Ok((ch, 3))
}
} else {
Ok((ch, 2))
}
}
/// return the next byte (or an error on EOF).
/// There's no guarantee the byte is a whole char
#[inline]
pub(crate) fn peek_byte(&self) -> Result<u8> {
let bytes = self.src.as_bytes();
if self.pos >= bytes.len() {
self.fail(Eof)
} else {
Ok(bytes[self.pos])
}
}
/// Return the next byte (at position pos). As it advances the cursor,
/// caller MUST throw an error if the byte isn't a valid full character.
#[inline]
pub(crate) fn next_byte(&mut self) -> Result<u8> {
let bytes = self.src.as_bytes();
if self.pos >= bytes.len() {
self.fail(Eof)
} else {
let b = bytes[self.pos];
self.pos += 1;
Ok(b)
}
}
/// Look at the first character in the input without consuming it.
#[inline]
pub(crate) fn peek_char(&self) -> Result<char> {
self.peek_code_point()
.map(|(code, _)| unsafe { char::from_u32_unchecked(code) })
}
/// Consume the first character in the input.
#[inline]
pub(crate) fn next_char(&mut self) -> Result<char> {
let (code, len) = self.peek_code_point()?;
self.pos += len;
let ch = unsafe { char::from_u32_unchecked(code) };
Ok(ch)
}
/// read bytes_count bytes of a string.
///
/// The validity of pos + bytes_count as a valid UTF8 position must
/// have been checked before.
#[inline]
pub(crate) fn take_str(&mut self, bytes_count: usize) -> Result<&str> {
if self.src.len() >= self.pos + bytes_count {
let pos = self.pos;
self.pos += bytes_count;
Ok(&self.src[pos..pos + bytes_count])
} else {
self.fail(Eof)
}
}
/// if the next bytes are s, then advance by its length and return true
/// otherwise return false.
/// We do a comparison with a &[u8] to avoid the risk of trying to read
/// at arbitrary positions and falling between valid UTF8 positions
#[inline]
pub(crate) fn try_read(&mut self, s: &[u8]) -> bool {
#[allow(clippy::collapsible_if)]
if self.src.len() >= self.pos + s.len() {
if &self.src.as_bytes()[self.pos..self.pos + s.len()] == s {
self.pos += s.len();
return true;
}
}
false
}
/// return the `len` first bytes of the input, without checking anything
/// (assuming it has been done) nor consuming anything
#[inline]
pub(crate) fn start(&self, len: usize) -> &'de str {
&self.src[self.pos..self.pos + len]
}
/// remove the next character (which is assumed to be ch)
#[inline]
pub(crate) fn drop(&mut self, ch: char) {
self.advance(ch.len_utf8());
}
/// advance the cursor (assuming bytes_count is consistent with chars)
#[inline]
pub(crate) fn advance(&mut self, bytes_count: usize) {
self.pos += bytes_count;
}
/// tells whether the next three bytes are `'''` which
/// is the start or end of a multiline string literal in Hjson
#[inline]
fn is_at_triple_quote(&self) -> bool {
self.src.len() >= self.pos + 3
&& &self.src[self.pos..self.pos + 3] == "'''"
}
#[inline]
fn eat_line(&mut self) -> Result<()> {
self.accept_quoteless_value = true;
let bytes = self.src.as_bytes();
unsafe {
for i in self.pos..bytes.len() {
if *bytes.get_unchecked(i) == b'\n' {
self.advance(i - self.pos + 1);
return Ok(());
}
}
}
self.fail(Eof)
}
#[inline]
pub(crate) fn eat_until_star_slash(&mut self) -> Result<()> {
match self.input().find("*/") {
Some(len) => {
self.advance(len + 2);
Ok(())
}
None => self.fail(Eof),
}
}
/// consume spaces, new lines, comments, and stop before
/// first interesting char
#[inline]
pub(crate) fn eat_shit(&mut self) -> Result<()> {
let mut last_is_slash = false;
loop {
match self.peek_byte()? {
b'#' => {
self.eat_line()?;
last_is_slash = false;
}
b'*' => {
if last_is_slash {
self.eat_until_star_slash()?;
} else {
self.advance(1);
}
last_is_slash = false;
}
b'/' => {
if last_is_slash {
self.eat_line()?;
last_is_slash = false;
} else {
self.advance(1);
last_is_slash = true;
}
}
b'\n' => {
self.accept_quoteless_value = true;
self.advance(1);
last_is_slash = false;
}
b' ' | b'\t'| b'\x0C' | b'\r' => { // Hjson whitespaces
self.advance(1);
last_is_slash = false;
}
_ => {
if last_is_slash {
// we don't consume the /: it's the start of a string
self.pos -= 1;
}
return Ok(());
}
}
}
}
pub(crate) fn eat_shit_and(&mut self, mut including: Option<char>) -> Result<()> {
let mut last_is_slash = false;
loop {
let ch = self.peek_char()?;
match ch {
'#' => {
self.eat_line()?;
last_is_slash = false;
}
'*' => {
if last_is_slash {
self.eat_until_star_slash()?;
} else {
self.advance(1);
}
last_is_slash = false;
}
'/' => {
if last_is_slash {
self.eat_line()?;
last_is_slash = false;
} else {
self.advance(1);
last_is_slash = true;
}
}
'\n' => {
self.accept_quoteless_value = true;
self.advance(1);
last_is_slash = false;
}
_ if including == Some(ch) => {
self.drop(ch);
including = None;
last_is_slash = false;
}
_ if ch.is_whitespace() => {
self.drop(ch);
last_is_slash = false;
}
_ => {
if last_is_slash {
self.pos -= 1;
}
return Ok(());
}
}
}
}
/// Parse the JSON identifier `true` or `false`.
fn parse_bool(&mut self) -> Result<bool> {
self.eat_shit()?;
if self.try_read(b"true") {
Ok(true)
} else if self.try_read(b"false") {
Ok(false)
} else {
self.fail(ExpectedBoolean)
}
}
/// read the characters of the coming integer, without parsing the
/// resulting string
#[inline]
fn read_integer(&mut self, unsigned: bool) -> Result<&'de str> {
// parsing could be done in the same loop but then I would have
// to handle overflow
self.eat_shit()?;
let bytes = self.src.as_bytes();
for (idx, b) in bytes.iter().skip(self.pos).enumerate() {
match b {
b'-' if unsigned => {
return self.fail(ExpectedPositiveInteger);
}
b'-' if idx > 0 => {
return self.fail(UnexpectedChar);
}
b'0'..=b'9' | b'-' => {
// if it's too long, this will be handled at conversion
}
_ => {
let s = self.start(idx);
self.advance(idx); // we keep the last char
return Ok(s);
}
}
}
Ok(self.take_all())
}
/// read the characters of the coming floating point number, without parsing
#[inline]
fn read_float(&mut self) -> Result<&'de str> {
self.eat_shit()?;
let bytes = &self.src.as_bytes()[self.pos..];
for (idx, b) in bytes.iter().enumerate() {
match b {
b'0'..=b'9' | b'-' | b'+' | b'.' | b'e' | b'E' => {
// if it's invalid, this will be handled at conversion
}
_ => {
let s = self.start(idx);
self.advance(idx); // we keep the last char
return Ok(s);
}
}
}
Ok(self.take_all())
}
/// Parse a string until the next unescaped quote
#[inline]
fn parse_quoted_string(&mut self) -> Result<String> {
let mut s = String::new();
let starting_quote = self.next_char()?;
loop {
let mut c = self.next_char()?;
if c == starting_quote {
break;
} else if c == '\\' {
c = match self.next_byte()? {
b'\"' => '\"',
b'\'' => '\'',
b'\\' => '\\',
b'/' => '/',
b'b' => '\x08', // why did they put this in JSON ?
b'f' => '\x0c', // and this one ?!
b'n' => '\n',
b'r' => '\r',
b't' => '\t',
b'u' => {
self.take_str(4).ok()
.and_then(|s| u32::from_str_radix(s, 16).ok())
.and_then(std::char::from_u32)
.ok_or_else(|| self.err(InvalidEscapeSequence))?
}
_ => {
return self.fail(InvalidEscapeSequence);
}
};
}
s.push(c);
}
Ok(s)
}
/// Parse a string until end of line
fn parse_quoteless_str(&mut self) -> Result<&'de str> {
for (idx, ch) in self.input().char_indices() {
if ch == '\r' || ch == '\n' {
let s = self.start(idx);
self.advance(idx + 1);
return Ok(s.trim_end());
}
}
Ok(self.take_all().trim_end())
}
/// Parse a string until the next triple quote.
fn parse_multiline_string(&mut self) -> Result<String> {
let indent = self.col() - 1;
self.advance(3); // consume the triple quote
// if the multiline string starts on the same line
// as the triple quote, we must ignore the leading
// spaces
loop {
let b = self.peek_byte()?;
match b {
b'\n' => {
self.advance(1);
break;
}
b' ' | b'\t'| b'\x0C' | b'\r' => {
self.advance(1);
}
_ => {
break;
}
}
}
// we then loop on lines
let mut v = String::new();
let mut rem = indent; // the number of leading spaces we remove
while let Ok(ch) = self.next_char() {
match ch {
// `starts_with` avoids indexing past the end of a truncated input
'\'' if self.input().starts_with("''") => {
self.advance(2); // the 2 other quotes
v.truncate(v.trim_end_matches(|c| c=='\n' || c=='\r').len()); // trimming \n at end
return Ok(v);
}
'\n' => {
v.push(ch);
rem = indent;
}
'\r' => {
// a \r not followed by a \n is probably not
// valid but I'm not sure an error would be
// more useful here than silently ignoring it
}
' ' | '\t'| '\x0C' => {
if rem > 0 {
rem -= 1;
} else {
v.push(ch);
}
}
_ => {
rem = 0;
v.push(ch);
}
}
}
self.fail(Eof) // it's not legal to not have the triple quotes
}
/// Parse an identifier without quotes:
/// - map key
/// - enum variant
fn parse_quoteless_identifier(&mut self) -> Result<&'de str> {
self.eat_shit()?;
for (idx, ch) in self.input().char_indices() {
match ch {
',' | '[' | ']' | '{' | '}' | ':' | '\r'| '\n' => {
let s = self.start(idx);
self.advance(idx);
return Ok(s);
}
' ' | '\t' => {
let s = self.start(idx);
self.advance(idx + 1);
return Ok(s);
}
_ => {}
}
}
Ok(self.take_all())
}
/// parse a string which may be a value
/// (i.e. not a map key or variant identifier)
fn parse_string_value(&mut self) -> Result<String> {
self.eat_shit()?;
let b = self.peek_byte()?;
let v = match b {
b',' | b':' | b'[' | b']' | b'{' | b'}' => self.fail(UnexpectedChar),
b'\'' if self.is_at_triple_quote() => self.parse_multiline_string(),
b'"' | b'\'' => self.parse_quoted_string(),
_ => (if self.accept_quoteless_value {
self.parse_quoteless_str()
} else {
self.parse_quoteless_identifier()
})
.map(|s| s.to_string()),
};
self.accept_quoteless_value = true;
v
}
#[inline]
fn parse_identifier(&mut self) -> Result<String> {
self.eat_shit()?;
let b = self.peek_byte()?;
// we set accept_quoteless_value to true so that a quoteless
// string can be accepted *after* the current identifier
self.accept_quoteless_value = true;
let r = match b {
b',' | b':' | b'[' | b']' | b'{' | b'}' => self.fail(UnexpectedChar),
b'"' | b'\'' => self.parse_quoted_string(),
_ => self.parse_quoteless_identifier().map(|s| s.to_string())
};
r
}
/// Braceless Hjson: same as usual but not within { and },
/// can only be for the whole document
fn deserialize_braceless_map<V>(&mut self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let mut map_reader = MapReader::braceless(self);
map_reader.braceless = true;
let value = match visitor.visit_map(map_reader) {
Ok(v) => v,
Err(e) => {
return self.cook_err(e);
}
};
Ok(value)
}
}
impl<'de, 'a> de::Deserializer<'de> for &'a mut Deserializer<'de> {
type Error = Error;
fn deserialize_any<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.eat_shit()?;
match self.peek_byte()? {
b'"' | b'\'' => self.deserialize_string(visitor),
b'0'..=b'9' | b'-' => {
let number = Number::read(self)?;
number.visit(self, visitor)
}
b'[' => self.deserialize_seq(visitor),
b'{' => self.deserialize_map(visitor),
_ => {
if self.try_read(b"null") {
return visitor.visit_none();
}
if self.try_read(b"true") {
return visitor.visit_bool(true);
}
if self.try_read(b"false") {
return visitor.visit_bool(false);
}
let s = self.parse_string_value()?;
visitor.visit_string(s)
}
}
}
fn deserialize_bool<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_bool(self.parse_bool()?)
}
fn deserialize_i8<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let v = self
.read_integer(false)
.and_then(|s| s.parse().map_err(|_| self.err(ExpectedI8)))?;
visitor.visit_i8(v)
}
fn deserialize_i16<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let v = self
.read_integer(false)
.and_then(|s| s.parse().map_err(|_| self.err(ExpectedI16)))?;
visitor.visit_i16(v)
}
fn deserialize_i32<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let v = self
.read_integer(false)
.and_then(|s| s.parse().map_err(|_| self.err(ExpectedI32)))?;
visitor.visit_i32(v)
}
fn deserialize_i64<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let v = self
.read_integer(false)
.and_then(|s| s.parse().map_err(|_| self.err(ExpectedI64)))?;
visitor.visit_i64(v)
}
fn deserialize_u8<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let v = self
.read_integer(true)
.and_then(|s| s.parse().map_err(|_| self.err(ExpectedU8)))?;
visitor.visit_u8(v)
}
fn deserialize_u16<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let v = self
.read_integer(true)
.and_then(|s| s.parse().map_err(|_| self.err(ExpectedU16)))?;
visitor.visit_u16(v)
}
fn deserialize_u32<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let v = self
.read_integer(true)
.and_then(|s| s.parse().map_err(|_| self.err(ExpectedU32)))?;
visitor.visit_u32(v)
}
fn deserialize_u64<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let v = self
.read_integer(true)
.and_then(|s| s.parse().map_err(|_| self.err(ExpectedU64)))?;
visitor.visit_u64(v)
}
fn deserialize_f32<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let v = self
.read_float()
.and_then(|s| s.parse().map_err(|_| self.err(ExpectedF32)))?;
visitor.visit_f32(v)
}
fn deserialize_f64<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let v = self
.read_float()
.and_then(|s| s.parse().map_err(|_| self.err(ExpectedF64)))?;
visitor.visit_f64(v)
}
fn deserialize_char<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let c = self
.parse_string_value()
.and_then(|s| s.chars().next().ok_or_else(|| self.err(ExpectedSingleChar)))?;
visitor.visit_char(c)
}
fn deserialize_str<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
// we can't always borrow strs from the source as it's not possible
// when there's an escape sequence. So str are parsed as strings.
self.deserialize_string(visitor)
}
fn deserialize_string<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_string(self.parse_string_value()?)
}
fn deserialize_bytes<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_seq(visitor)
}
fn deserialize_byte_buf<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_seq(visitor)
}
fn deserialize_option<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.eat_shit()?;
if self.try_read(b"null") {
visitor.visit_none()
} else {
visitor.visit_some(self)
}
}
// In Serde, unit means an anonymous value containing no data.
fn deserialize_unit<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.eat_shit()?;
if self.try_read(b"null") {
visitor.visit_unit()
} else {
self.fail(ExpectedNull)
}
}
// Unit struct means a named value containing no data.
fn deserialize_unit_struct<V>(self, _name: &'static str, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_unit(visitor)
}
fn deserialize_newtype_struct<V>(self, _name: &'static str, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.eat_shit()?;
visitor.visit_newtype_struct(self)
}
fn deserialize_seq<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.eat_shit()?;
if self.next_byte()? == b'[' {
let value = visitor.visit_seq(SeqReader::new(self))?;
if self.next_byte()? == b']' {
Ok(value)
} else {
self.fail(ExpectedArrayEnd)
}
} else {
self.fail(ExpectedArray)
}
}
fn deserialize_tuple<V>(self, _len: usize, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_seq(visitor)
}
fn deserialize_tuple_struct<V>(
self,
_name: &'static str,
_len: usize,
visitor: V,
) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_seq(visitor)
}
fn deserialize_map<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let on_start = self.pos == 0;
if let Err(e) = self.eat_shit() {
if on_start && e.is_eof() {
return self.deserialize_braceless_map(visitor);
} else {
return Err(e);
}
}
if self.peek_byte()? == b'{' {
self.advance(1);
let value = match visitor.visit_map(MapReader::within_braces(self)) {
Ok(v) => v,
Err(e) => {
return self.cook_err(e);
}
};
self.eat_shit()?;
if self.next_byte()? == b'}' {
Ok(value)
} else {
self.fail(ExpectedMapEnd)
}
} else if on_start {
self.deserialize_braceless_map(visitor)
} else {
self.fail(ExpectedMap)
}
}
fn deserialize_struct<V>(
self,
_name: &'static str,
_fields: &'static [&'static str],
visitor: V,
) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_map(visitor)
}
fn deserialize_enum<V>(
self,
_name: &'static str,
_variants: &'static [&'static str],
visitor: V,
) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.eat_shit()?;
match self.peek_byte()? {
b'"' | b'\'' => {
// Visit a unit variant.
visitor.visit_enum(self.parse_quoted_string()?.into_deserializer())
}
b'{' => {
self.advance(1);
// Visit a newtype variant, tuple variant, or struct variant.
let value = visitor.visit_enum(EnumReader::new(self))?;
self.eat_shit()?;
if self.next_byte()? == b'}' {
Ok(value)
} else {
self.fail(ExpectedMapEnd)
}
}
_ => {
visitor.visit_enum(self.parse_quoteless_identifier()?.into_deserializer())
}
}
}
fn deserialize_identifier<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
visitor.visit_string(self.parse_identifier()?)
}
fn deserialize_ignored_any<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
self.deserialize_any(visitor)
}
}
deser-hjson-2.2.4/src/de_enum.rs
use {
crate::{
de::Deserializer,
error::{Error, ErrorCode::*, Result},
},
serde::de::{self, DeserializeSeed, EnumAccess, VariantAccess, Visitor},
};
pub struct EnumReader<'a, 'de: 'a> {
de: &'a mut Deserializer<'de>,
}
impl<'a, 'de> EnumReader<'a, 'de> {
pub fn new(de: &'a mut Deserializer<'de>) -> Self {
EnumReader { de }
}
}
// `EnumAccess` is provided to the `Visitor` to give it the ability to determine
// which variant of the enum is supposed to be deserialized.
//
// Note that all enum deserialization methods in Serde refer exclusively to the
// "externally tagged" enum representation.
impl<'de, 'a> EnumAccess<'de> for EnumReader<'a, 'de> {
type Error = Error;
type Variant = Self;
fn variant_seed<V>(self, seed: V) -> Result<(V::Value, Self::Variant)>
where
V: DeserializeSeed<'de>,
{
// The `deserialize_enum` method parsed a `{` character so we are
// currently inside of a map. The seed will be deserializing itself from
// the key of the map.
let val = seed.deserialize(&mut *self.de)?;
self.de.eat_shit()?;
if self.de.next_byte()? == b':' {
Ok((val, self))
} else {
self.de.fail(ExpectedMapColon)
}
}
}
// `VariantAccess` is provided to the `Visitor` to give it the ability to see
// the content of the single variant that it decided to deserialize.
impl<'de, 'a> VariantAccess<'de> for EnumReader<'a, 'de> {
type Error = Error;
// If the `Visitor` expected this variant to be a unit variant, the input
// should have been the plain string case handled in `deserialize_enum`.
fn unit_variant(self) -> Result<()> {
self.de.fail(ExpectedString)
}
// Newtype variants are represented in JSON as `{ NAME: VALUE }` so
// deserialize the value here.
fn newtype_variant_seed<T>(self, seed: T) -> Result<T::Value>
where
T: DeserializeSeed<'de>,
{
seed.deserialize(self.de)
}
// Tuple variants are represented in JSON as `{ NAME: [DATA...] }` so
// deserialize the sequence of data here.
fn tuple_variant<V>(self, _len: usize, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
de::Deserializer::deserialize_seq(self.de, visitor)
}
// Struct variants are represented in JSON as `{ NAME: { K: V, ... } }` so
// deserialize the inner map here.
fn struct_variant<V>(self, _fields: &'static [&'static str], visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
de::Deserializer::deserialize_map(self.de, visitor)
}
}
deser-hjson-2.2.4/src/de_map.rs
use {
crate::{
de::Deserializer,
error::{Error, ErrorCode::*, Result},
},
serde::de::{DeserializeSeed, MapAccess},
};
pub struct MapReader<'a, 'de: 'a> {
de: &'a mut Deserializer<'de>,
/// if braceless is true, the map may be closed by an eof instead of a '}'
pub braceless: bool,
}
impl<'a, 'de> MapReader<'a, 'de> {
pub fn braceless(de: &'a mut Deserializer<'de>) -> Self {
MapReader { de, braceless: true }
}
pub fn within_braces(de: &'a mut Deserializer<'de>) -> Self {
MapReader { de, braceless: false }
}
}
// `MapAccess` is provided to the `Visitor` to give it the ability to iterate
// through entries of the map.
impl<'de, 'a> MapAccess<'de> for MapReader<'a, 'de> {
type Error = Error;
/// read a map key and the following colon
fn next_key_seed<K>(&mut self, seed: K) -> Result<Option<K::Value>>