pest-2.1.2/_README.md

# pest. The Elegant Parser
[Gitter](https://gitter.im/dragostis/pest?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[Book](https://pest-parser.github.io/book)
[Docs](https://docs.rs/pest)
[Build Status](https://travis-ci.org/pest-parser/pest)
[Coverage](https://codecov.io/gh/pest-parser/pest)
[Crates.io](https://crates.io/crates/pest)

pest is a general purpose parser written in Rust with a focus on accessibility,
correctness, and performance. It uses parsing expression grammars
(or [PEG]) as input, which are similar in spirit to regular expressions, but
which offer the enhanced expressivity needed to parse complex languages.

[PEG]: https://en.wikipedia.org/wiki/Parsing_expression_grammar
## Getting started
The recommended way to start parsing with pest is to read the official [book].
Other helpful resources:
* API reference on [docs.rs]
* play with grammars and share them on our [fiddle]
* leave feedback, ask questions, or greet us on [Gitter]

[book]: https://pest-parser.github.io/book
[docs.rs]: https://docs.rs/pest
[fiddle]: https://pest-parser.github.io/#editor
[Gitter]: https://gitter.im/dragostis/pest
## Example
The following is an example of a grammar for a list of alpha-numeric identifiers
where the first identifier does not start with a digit:
```rust
alpha = { 'a'..'z' | 'A'..'Z' }
digit = { '0'..'9' }
ident = { (alpha | digit)+ }
ident_list = _{ !digit ~ ident ~ (" " ~ ident)+ }
//           ^
//           ident_list rule is silent which means it produces no tokens
```
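To make the semantics of the rules above concrete, here is a hand-rolled checker that accepts the same strings (illustration only — pest generates the real parser from the `.pest` file; none of these functions exist in pest's API):

```rust
// ident = { (alpha | digit)+ }
fn is_ident(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric())
}

// ident_list = _{ !digit ~ ident ~ (" " ~ ident)+ }
fn is_ident_list(s: &str) -> bool {
    let mut parts = s.split(' ');
    match parts.next() {
        // !digit: the first identifier must not start with a digit
        Some(first)
            if is_ident(first) && !first.starts_with(|c: char| c.is_ascii_digit()) =>
        {
            let rest: Vec<_> = parts.collect();
            // (" " ~ ident)+ requires at least one more identifier
            !rest.is_empty() && rest.iter().all(|p| is_ident(p))
        }
        _ => false,
    }
}

fn main() {
    assert!(is_ident_list("a1 b2"));
    assert!(!is_ident_list("123 b2")); // first identifier starts with a digit
    assert!(!is_ident_list("a1"));     // needs at least two identifiers
    println!("ok");
}
```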
Grammars are saved in separate .pest files which are never mixed with procedural
code. This results in an always up-to-date formalization of a language that is
easy to read and maintain.
## Meaningful error reporting
Based on the grammar definition, the parser also includes automatic error
reporting. For the example above, the input `"123"` will result in:
```
thread 'main' panicked at ' --> 1:1
|
1 | 123
| ^---
|
= unexpected digit', src/main.rs:12
```
while `"ab *"` will result in:
```
thread 'main' panicked at ' --> 1:1
|
1 | ab *
| ^---
|
= expected ident', src/main.rs:12
```
## Pairs API
The grammar can be used to derive a `Parser` implementation automatically.
Parsing returns an iterator of nested token pairs:
```rust
extern crate pest;
#[macro_use]
extern crate pest_derive;
use pest::Parser;
#[derive(Parser)]
#[grammar = "ident.pest"]
struct IdentParser;
fn main() {
    let pairs = IdentParser::parse(Rule::ident_list, "a1 b2").unwrap_or_else(|e| panic!("{}", e));

    // Because ident_list is silent, the iterator will contain idents
    for pair in pairs {
        // A pair is a combination of the rule which matched and a span of input
        println!("Rule: {:?}", pair.as_rule());
        println!("Span: {:?}", pair.as_span());
        println!("Text: {}", pair.as_str());

        // A pair can be converted to an iterator of the tokens which make it up:
        for inner_pair in pair.into_inner() {
            match inner_pair.as_rule() {
                Rule::alpha => println!("Letter: {}", inner_pair.as_str()),
                Rule::digit => println!("Digit: {}", inner_pair.as_str()),
                _ => unreachable!()
            };
        }
    }
}
```
This produces the following output:
```
Rule: ident
Span: Span { start: 0, end: 2 }
Text: a1
Letter: a
Digit: 1
Rule: ident
Span: Span { start: 3, end: 5 }
Text: b2
Letter: b
Digit: 2
```
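The nested structure those loops walk can be modeled with a toy tree — plain Rust, independent of pest's actual `Pair` type (which additionally carries spans into the input); `ToyPair` and `walk` are hypothetical names for illustration:

```rust
// Toy stand-in for pest's nested pairs: a rule name, the matched text,
// and the inner pairs that make it up.
struct ToyPair {
    rule: &'static str,
    text: String,
    inner: Vec<ToyPair>,
}

// Collect "rule: text" lines depth-first, mirroring the nested loops above.
fn walk(pair: &ToyPair, out: &mut Vec<String>) {
    out.push(format!("{}: {}", pair.rule, pair.text));
    for inner in &pair.inner {
        walk(inner, out);
    }
}

fn main() {
    let ident = ToyPair {
        rule: "ident",
        text: "a1".to_string(),
        inner: vec![
            ToyPair { rule: "alpha", text: "a".to_string(), inner: vec![] },
            ToyPair { rule: "digit", text: "1".to_string(), inner: vec![] },
        ],
    };
    let mut out = Vec::new();
    walk(&ident, &mut out);
    assert_eq!(out, ["ident: a1", "alpha: a", "digit: 1"]);
    println!("{:?}", out);
}
```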
## Other features
* Precedence climbing
* Input handling
* Custom errors
* Runs on stable Rust
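Precedence climbing, the first feature listed, is a standard way to parse infix operators with correct precedence and associativity. A minimal dependency-free version of the algorithm over already-tokenized input might look like this (a sketch of the general technique under assumed token names, not pest's actual API):

```rust
// Minimal precedence climbing over a token slice: "+" and "*" with the
// usual precedences, evaluating as it parses. Illustration only.
#[derive(Clone, Copy)]
enum Tok { Num(i64), Plus, Star }

fn prec(t: &Tok) -> Option<u8> {
    match t {
        Tok::Plus => Some(1),
        Tok::Star => Some(2),
        _ => None,
    }
}

fn climb(toks: &[Tok], pos: &mut usize, min_prec: u8) -> i64 {
    // An atom is always a number in this toy grammar.
    let mut lhs = match toks[*pos] {
        Tok::Num(n) => { *pos += 1; n }
        _ => panic!("expected number"),
    };
    // Consume operators of at least `min_prec`, recursing with a higher
    // minimum so that `*` binds tighter than `+`.
    while *pos < toks.len() {
        let op = toks[*pos];
        let p = match prec(&op) {
            Some(p) if p >= min_prec => p,
            _ => break,
        };
        *pos += 1;
        let rhs = climb(toks, pos, p + 1); // p + 1 makes the operator left-associative
        lhs = match op {
            Tok::Plus => lhs + rhs,
            Tok::Star => lhs * rhs,
            _ => unreachable!(),
        };
    }
    lhs
}

fn main() {
    // 1 + 2 * 3 evaluates to 7, not 9
    let toks = [Tok::Num(1), Tok::Plus, Tok::Num(2), Tok::Star, Tok::Num(3)];
    assert_eq!(climb(&toks, &mut 0, 0), 7);
    println!("ok");
}
```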
## Projects using pest
* [pest_meta](https://github.com/pest-parser/pest/blob/master/meta/src/grammar.pest) (bootstrapped)
* [brain](https://github.com/brain-lang/brain)
* [Chelone](https://github.com/Aaronepower/chelone)
* [comrak](https://github.com/kivikakk/comrak)
* [elastic-rs](https://github.com/cch123/elastic-rs)
* [graphql-parser](https://github.com/Keats/graphql-parser)
* [handlebars-rust](https://github.com/sunng87/handlebars-rust)
* [hexdino](https://github.com/Luz/hexdino)
* [Huia](https://gitlab.com/jimsy/huia/)
* [jql](https://github.com/yamafaktory/jql)
* [json5-rs](https://github.com/callum-oakley/json5-rs)
* [mt940](https://github.com/svenstaro/mt940-rs)
* [py_literal](https://github.com/jturner314/py_literal)
* [rouler](https://github.com/jarcane/rouler)
* [RuSh](https://github.com/lwandrebeck/RuSh)
* [rs_pbrt](https://github.com/wahn/rs_pbrt)
* [stache](https://github.com/dgraham/stache)
* [tera](https://github.com/Keats/tera)
* [ui_gen](https://github.com/emoon/ui_gen)
* [ukhasnet-parser](https://github.com/adamgreig/ukhasnet-parser)
* [ZoKrates](https://github.com/ZoKrates/ZoKrates)
## Special thanks
A special round of applause goes to Prof. Marius Minea for his guidance and to
all pest contributors, some of whom are none other than my friends.

pest-2.1.2/Cargo.toml.orig

[package]
name = "pest"
description = "The Elegant Parser"
version = "2.1.2"
authors = ["Dragoș Tiselice "]
homepage = "https://pest-parser.github.io/"
repository = "https://github.com/pest-parser/pest"
documentation = "https://docs.rs/pest"
keywords = ["pest", "parser", "peg", "grammar"]
categories = ["parsing"]
license = "MIT/Apache-2.0"
readme = "_README.md"
[features]
# Enables the `to_json` function for `Pair` and `Pairs`
pretty-print = ["serde", "serde_json"]
[dependencies]
ucd-trie = "0.1.1"
serde = { version = "1.0.89", optional = true }
serde_json = { version = "1.0.39", optional = true}
[badges]
codecov = { repository = "pest-parser/pest" }
maintenance = { status = "actively-developed" }
travis-ci = { repository = "pest-parser/pest" }
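
The optional `pretty-print` feature above gates the `to_json` methods on `Pair` and `Pairs`. A downstream crate would opt in from its own manifest, for example (hypothetical dependent crate):

```toml
[dependencies]
pest = { version = "2.1.2", features = ["pretty-print"] }
```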

pest-2.1.2/Cargo.toml

# THIS FILE IS AUTOMATICALLY GENERATED BY CARGO
#
# When uploading crates to the registry Cargo will automatically
# "normalize" Cargo.toml files for maximal compatibility
# with all versions of Cargo and also rewrite `path` dependencies
# to registry (e.g. crates.io) dependencies
#
# If you believe there's an error in this file please file an
# issue against the rust-lang/cargo repository. If you're
# editing this file be aware that the upstream Cargo.toml
# will likely look very different (and much more reasonable)
[package]
name = "pest"
version = "2.1.2"
authors = ["Dragoș Tiselice "]
description = "The Elegant Parser"
homepage = "https://pest-parser.github.io/"
documentation = "https://docs.rs/pest"
readme = "_README.md"
keywords = ["pest", "parser", "peg", "grammar"]
categories = ["parsing"]
license = "MIT/Apache-2.0"
repository = "https://github.com/pest-parser/pest"
[dependencies.serde]
version = "1.0.89"
optional = true
[dependencies.serde_json]
version = "1.0.39"
optional = true
[dependencies.ucd-trie]
version = "0.1.1"
[features]
pretty-print = ["serde", "serde_json"]
[badges.codecov]
repository = "pest-parser/pest"
[badges.maintenance]
status = "actively-developed"
[badges.travis-ci]
repository = "pest-parser/pest"

pest-2.1.2/examples/parens.rs

extern crate pest;
use std::io::{self, Write};
use pest::error::Error;
use pest::iterators::Pairs;
use pest::{state, ParseResult, Parser, ParserState};
#[allow(dead_code, non_camel_case_types)]
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
enum Rule {
expr,
paren,
paren_end,
}
struct ParenParser;
impl Parser<Rule> for ParenParser {
fn parse(rule: Rule, input: &str) -> Result<Pairs<Rule>, Error<Rule>> {
fn expr(state: Box<ParserState<Rule>>) -> ParseResult<Box<ParserState<Rule>>> {
state.sequence(|s| s.repeat(|s| paren(s)).and_then(|s| s.end_of_input()))
}
fn paren(state: Box<ParserState<Rule>>) -> ParseResult<Box<ParserState<Rule>>> {
state.rule(Rule::paren, |s| {
s.sequence(|s| {
s.match_string("(")
.and_then(|s| {
s.optional(|s| {
s.sequence(|s| {
s.lookahead(true, |s| s.match_string("("))
.and_then(|s| s.repeat(|s| paren(s)))
})
})
})
.and_then(|s| s.rule(Rule::paren_end, |s| s.match_string(")")))
})
})
}
state(input, |state| match rule {
Rule::expr => expr(state),
Rule::paren => paren(state),
_ => unreachable!(),
})
}
}
#[derive(Debug)]
struct Paren(Vec<Paren>);
fn expr(pairs: Pairs<Rule>) -> Vec<Paren> {
pairs
.filter(|p| p.as_rule() == Rule::paren)
.map(|p| Paren(expr(p.into_inner())))
.collect()
}
fn main() {
loop {
let mut line = String::new();
print!("> ");
io::stdout().flush().unwrap();
io::stdin().read_line(&mut line).unwrap();
line.pop();
match ParenParser::parse(Rule::expr, &line) {
Ok(pairs) => println!("{:?}", expr(pairs)),
Err(e) => println!("\n{}", e),
};
}
}
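
For comparison, the nesting that the example's `expr`/`paren` rules recover can be computed without any parser machinery. A dependency-free sketch of the same recursive shape (`Group` and `groups` are hypothetical names, mirroring the `Paren(Vec<Paren>)` structure above):

```rust
// Parse a balanced-paren string into nested groups. Illustration only.
#[derive(Debug, PartialEq)]
struct Group(Vec<Group>);

// Returns the sibling groups parsed from `chars`, or None on malformed input.
fn groups(chars: &mut std::iter::Peekable<std::str::Chars>) -> Option<Vec<Group>> {
    let mut out = Vec::new();
    while let Some(&c) = chars.peek() {
        match c {
            '(' => {
                chars.next();
                let inner = groups(chars)?;
                // Every group must be closed by a matching ')'.
                match chars.next() {
                    Some(')') => out.push(Group(inner)),
                    _ => return None,
                }
            }
            ')' => break, // let the caller consume the ')'
            _ => return None,
        }
    }
    Some(out)
}

fn main() {
    let mut it = "(()())".chars().peekable();
    let parsed = groups(&mut it);
    assert!(it.next().is_none());
    assert_eq!(
        parsed,
        Some(vec![Group(vec![Group(vec![]), Group(vec![])])])
    );
    println!("ok");
}
```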

pest-2.1.2/LICENSE-APACHE

Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

pest-2.1.2/LICENSE-MIT

Permission is hereby granted, free of charge, to any
person obtaining a copy of this software and associated
documentation files (the "Software"), to deal in the
Software without restriction, including without
limitation the rights to use, copy, modify, merge,
publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software
is furnished to do so, subject to the following
conditions:
The above copyright notice and this permission notice
shall be included in all copies or substantial portions
of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF
ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR
IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.

pest-2.1.2/src/error.rs

// pest. The Elegant Parser
// Copyright (c) 2018 Dragoș Tiselice
//
// Licensed under the Apache License, Version 2.0
// <LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0> or the MIT
// license <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. All files in the project carrying such notice may not be copied,
// modified, or distributed except according to those terms.
//! Types for different kinds of parsing failures.
use std::cmp;
use std::error;
use std::fmt;
use std::mem;
use position::Position;
use span::Span;
use RuleType;
/// Parse-related error type.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub struct Error<R> {
/// Variant of the error
pub variant: ErrorVariant<R>,
/// Location within the input string
pub location: InputLocation,
/// Line/column within the input string
pub line_col: LineColLocation,
path: Option<String>,
line: String,
continued_line: Option<String>,
}
/// Different kinds of parsing errors.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub enum ErrorVariant<R> {
/// Generated parsing error with expected and unexpected `Rule`s
ParsingError {
/// Positive attempts
positives: Vec<R>,
/// Negative attempts
negatives: Vec<R>,
},
/// Custom error with a message
CustomError {
/// Short explanation
message: String,
},
}
/// Where an `Error` has occurred.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub enum InputLocation {
/// `Error` was created by `Error::new_from_pos`
Pos(usize),
/// `Error` was created by `Error::new_from_span`
Span((usize, usize)),
}
/// Line/column where an `Error` has occurred.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub enum LineColLocation {
/// Line/column pair if `Error` was created by `Error::new_from_pos`
Pos((usize, usize)),
/// Line/column pairs if `Error` was created by `Error::new_from_span`
Span((usize, usize), (usize, usize)),
}
impl<R: RuleType> Error<R> {
/// Creates `Error` from `ErrorVariant` and `Position`.
///
/// # Examples
///
/// ```
/// # use pest::error::{Error, ErrorVariant};
/// # use pest::Position;
/// # #[allow(non_camel_case_types)]
/// # #[allow(dead_code)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// # enum Rule {
/// # open_paren,
/// # closed_paren
/// # }
/// # let input = "";
/// # let pos = Position::from_start(input);
/// let error = Error::new_from_pos(
/// ErrorVariant::ParsingError {
/// positives: vec![Rule::open_paren],
/// negatives: vec![Rule::closed_paren]
/// },
/// pos
/// );
///
/// println!("{}", error);
/// ```
#[allow(clippy::needless_pass_by_value)]
pub fn new_from_pos(variant: ErrorVariant<R>, pos: Position) -> Error<R> {
Error {
variant,
location: InputLocation::Pos(pos.pos()),
path: None,
line: visualize_whitespace(pos.line_of()),
continued_line: None,
line_col: LineColLocation::Pos(pos.line_col()),
}
}
/// Creates `Error` from `ErrorVariant` and `Span`.
///
/// # Examples
///
/// ```
/// # use pest::error::{Error, ErrorVariant};
/// # use pest::{Position, Span};
/// # #[allow(non_camel_case_types)]
/// # #[allow(dead_code)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// # enum Rule {
/// # open_paren,
/// # closed_paren
/// # }
/// # let input = "";
/// # let start = Position::from_start(input);
/// # let end = start.clone();
/// # let span = start.span(&end);
/// let error = Error::new_from_span(
/// ErrorVariant::ParsingError {
/// positives: vec![Rule::open_paren],
/// negatives: vec![Rule::closed_paren]
/// },
/// span
/// );
///
/// println!("{}", error);
/// ```
#[allow(clippy::needless_pass_by_value)]
pub fn new_from_span(variant: ErrorVariant<R>, span: Span) -> Error<R> {
let end = span.end_pos();
let mut end_line_col = end.line_col();
// end position is after a \n, so we want to point to the visual lf symbol
if end_line_col.1 == 1 {
let mut visual_end = end.clone();
visual_end.skip_back(1);
let lc = visual_end.line_col();
end_line_col = (lc.0, lc.1 + 1);
};
let mut line_iter = span.lines();
let start_line = visualize_whitespace(line_iter.next().unwrap_or(""));
let continued_line = line_iter.last().map(visualize_whitespace);
Error {
variant,
location: InputLocation::Span((span.start(), end.pos())),
path: None,
line: start_line,
continued_line,
line_col: LineColLocation::Span(span.start_pos().line_col(), end_line_col),
}
}
/// Returns `Error` variant with `path` which is shown when formatted with `Display`.
///
/// # Examples
///
/// ```
/// # use pest::error::{Error, ErrorVariant};
/// # use pest::Position;
/// # #[allow(non_camel_case_types)]
/// # #[allow(dead_code)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// # enum Rule {
/// # open_paren,
/// # closed_paren
/// # }
/// # let input = "";
/// # let pos = Position::from_start(input);
/// Error::new_from_pos(
/// ErrorVariant::ParsingError {
/// positives: vec![Rule::open_paren],
/// negatives: vec![Rule::closed_paren]
/// },
/// pos
/// ).with_path("file.rs");
/// ```
pub fn with_path(mut self, path: &str) -> Error<R> {
self.path = Some(path.to_owned());
self
}
/// Renames all `Rule`s if this is a [`ParsingError`]. It does nothing when called on a
/// [`CustomError`].
///
/// Useful in order to rename verbose rules or have detailed per-`Rule` formatting.
///
/// [`ParsingError`]: enum.ErrorVariant.html#variant.ParsingError
/// [`CustomError`]: enum.ErrorVariant.html#variant.CustomError
///
/// # Examples
///
/// ```
/// # use pest::error::{Error, ErrorVariant};
/// # use pest::Position;
/// # #[allow(non_camel_case_types)]
/// # #[allow(dead_code)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// # enum Rule {
/// # open_paren,
/// # closed_paren
/// # }
/// # let input = "";
/// # let pos = Position::from_start(input);
/// Error::new_from_pos(
/// ErrorVariant::ParsingError {
/// positives: vec![Rule::open_paren],
/// negatives: vec![Rule::closed_paren]
/// },
/// pos
/// ).renamed_rules(|rule| {
/// match *rule {
/// Rule::open_paren => "(".to_owned(),
/// Rule::closed_paren => "closed paren".to_owned()
/// }
/// });
/// ```
pub fn renamed_rules<F>(mut self, f: F) -> Error<R>
where
F: FnMut(&R) -> String,
{
let variant = match self.variant {
ErrorVariant::ParsingError {
positives,
negatives,
} => {
let message = Error::parsing_error_message(&positives, &negatives, f);
ErrorVariant::CustomError { message }
}
variant => variant,
};
self.variant = variant;
self
}
fn start(&self) -> (usize, usize) {
match self.line_col {
LineColLocation::Pos(line_col) => line_col,
LineColLocation::Span(start_line_col, _) => start_line_col,
}
}
fn spacing(&self) -> String {
let line = match self.line_col {
LineColLocation::Pos((line, _)) => line,
LineColLocation::Span((start_line, _), (end_line, _)) => cmp::max(start_line, end_line),
};
let line_str_len = format!("{}", line).len();
let mut spacing = String::new();
for _ in 0..line_str_len {
spacing.push(' ');
}
spacing
}
fn underline(&self) -> String {
let mut underline = String::new();
let mut start = self.start().1;
let end = match self.line_col {
LineColLocation::Span(_, (_, mut end)) => {
let inverted_cols = start > end;
if inverted_cols {
mem::swap(&mut start, &mut end);
start -= 1;
end += 1;
}
Some(end)
}
_ => None,
};
let offset = start - 1;
let line_chars = self.line.chars();
for c in line_chars.take(offset) {
match c {
'\t' => underline.push('\t'),
_ => underline.push(' '),
}
}
if let Some(end) = end {
if end - start > 1 {
underline.push('^');
for _ in 2..(end - start) {
underline.push('-');
}
underline.push('^');
} else {
underline.push('^');
}
} else {
underline.push_str("^---")
}
underline
}
fn message(&self) -> String {
match self.variant {
ErrorVariant::ParsingError {
ref positives,
ref negatives,
} => Error::parsing_error_message(positives, negatives, |r| format!("{:?}", r)),
ErrorVariant::CustomError { ref message } => message.clone(),
}
}
fn parsing_error_message<F>(positives: &[R], negatives: &[R], mut f: F) -> String
where
F: FnMut(&R) -> String,
{
match (negatives.is_empty(), positives.is_empty()) {
(false, false) => format!(
"unexpected {}; expected {}",
Error::enumerate(negatives, &mut f),
Error::enumerate(positives, &mut f)
),
(false, true) => format!("unexpected {}", Error::enumerate(negatives, &mut f)),
(true, false) => format!("expected {}", Error::enumerate(positives, &mut f)),
(true, true) => "unknown parsing error".to_owned(),
}
}
fn enumerate<F>(rules: &[R], f: &mut F) -> String
where
F: FnMut(&R) -> String,
{
match rules.len() {
1 => f(&rules[0]),
2 => format!("{} or {}", f(&rules[0]), f(&rules[1])),
l => {
let separated = rules
.iter()
.take(l - 1)
.map(|r| f(r))
.collect::<Vec<_>>()
.join(", ");
format!("{}, or {}", separated, f(&rules[l - 1]))
}
}
}
pub(crate) fn format(&self) -> String {
let spacing = self.spacing();
let path = self
.path
.as_ref()
.map(|path| format!("{}:", path))
.unwrap_or_default();
let pair = (self.line_col.clone(), &self.continued_line);
if let (LineColLocation::Span(_, end), &Some(ref continued_line)) = pair {
let has_line_gap = end.0 - self.start().0 > 1;
if has_line_gap {
format!(
"{s}--> {p}{ls}:{c}\n\
{s} |\n\
{ls:w$} | {line}\n\
{s} | ...\n\
{le:w$} | {continued_line}\n\
{s} | {underline}\n\
{s} |\n\
{s} = {message}",
s = spacing,
w = spacing.len(),
p = path,
ls = self.start().0,
le = end.0,
c = self.start().1,
line = self.line,
continued_line = continued_line,
underline = self.underline(),
message = self.message()
)
} else {
format!(
"{s}--> {p}{ls}:{c}\n\
{s} |\n\
{ls:w$} | {line}\n\
{le:w$} | {continued_line}\n\
{s} | {underline}\n\
{s} |\n\
{s} = {message}",
s = spacing,
w = spacing.len(),
p = path,
ls = self.start().0,
le = end.0,
c = self.start().1,
line = self.line,
continued_line = continued_line,
underline = self.underline(),
message = self.message()
)
}
} else {
format!(
"{s}--> {p}{l}:{c}\n\
{s} |\n\
{l} | {line}\n\
{s} | {underline}\n\
{s} |\n\
{s} = {message}",
s = spacing,
p = path,
l = self.start().0,
c = self.start().1,
line = self.line,
underline = self.underline(),
message = self.message()
)
}
}
}
impl<R: RuleType> fmt::Display for Error<R> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{}", self.format())
}
}
impl<R: RuleType> error::Error for Error<R> {
fn description(&self) -> &str {
match self.variant {
ErrorVariant::ParsingError { .. } => "parsing error",
ErrorVariant::CustomError { ref message } => message,
}
}
}
fn visualize_whitespace(input: &str) -> String {
input.to_owned().replace('\r', "␍").replace('\n', "␊")
}
#[cfg(test)]
mod tests {
use super::super::position;
use super::*;
#[test]
fn display_parsing_error_mixed() {
let input = "ab\ncd\nef";
let pos = position::Position::new(input, 4).unwrap();
let error: Error<u32> = Error::new_from_pos(
ErrorVariant::ParsingError {
positives: vec![1, 2, 3],
negatives: vec![4, 5, 6],
},
pos,
);
assert_eq!(
format!("{}", error),
vec![
" --> 2:2",
" |",
"2 | cd␊",
" | ^---",
" |",
" = unexpected 4, 5, or 6; expected 1, 2, or 3",
]
.join("\n")
);
}
#[test]
fn display_parsing_error_positives() {
let input = "ab\ncd\nef";
let pos = position::Position::new(input, 4).unwrap();
let error: Error<u32> = Error::new_from_pos(
ErrorVariant::ParsingError {
positives: vec![1, 2],
negatives: vec![],
},
pos,
);
assert_eq!(
format!("{}", error),
vec![
" --> 2:2",
" |",
"2 | cd␊",
" | ^---",
" |",
" = expected 1 or 2",
]
.join("\n")
);
}
#[test]
fn display_parsing_error_negatives() {
let input = "ab\ncd\nef";
let pos = position::Position::new(input, 4).unwrap();
let error: Error<u32> = Error::new_from_pos(
ErrorVariant::ParsingError {
positives: vec![],
negatives: vec![4, 5, 6],
},
pos,
);
assert_eq!(
format!("{}", error),
vec![
" --> 2:2",
" |",
"2 | cd␊",
" | ^---",
" |",
" = unexpected 4, 5, or 6",
]
.join("\n")
);
}
#[test]
fn display_parsing_error_unknown() {
let input = "ab\ncd\nef";
let pos = position::Position::new(input, 4).unwrap();
let error: Error<u32> = Error::new_from_pos(
ErrorVariant::ParsingError {
positives: vec![],
negatives: vec![],
},
pos,
);
assert_eq!(
format!("{}", error),
vec![
" --> 2:2",
" |",
"2 | cd␊",
" | ^---",
" |",
" = unknown parsing error",
]
.join("\n")
);
}
#[test]
fn display_custom_pos() {
let input = "ab\ncd\nef";
let pos = position::Position::new(input, 4).unwrap();
let error: Error<u32> = Error::new_from_pos(
ErrorVariant::CustomError {
message: "error: big one".to_owned(),
},
pos,
);
assert_eq!(
format!("{}", error),
vec![
" --> 2:2",
" |",
"2 | cd␊",
" | ^---",
" |",
" = error: big one",
]
.join("\n")
);
}
#[test]
fn display_custom_span_two_lines() {
let input = "ab\ncd\nefgh";
let start = position::Position::new(input, 4).unwrap();
let end = position::Position::new(input, 9).unwrap();
let error: Error<u32> = Error::new_from_span(
ErrorVariant::CustomError {
message: "error: big one".to_owned(),
},
start.span(&end),
);
assert_eq!(
format!("{}", error),
vec![
" --> 2:2",
" |",
"2 | cd␊",
"3 | efgh",
" | ^^",
" |",
" = error: big one",
]
.join("\n")
);
}
#[test]
fn display_custom_span_three_lines() {
let input = "ab\ncd\nefgh";
let start = position::Position::new(input, 1).unwrap();
let end = position::Position::new(input, 9).unwrap();
let error: Error<u32> = Error::new_from_span(
ErrorVariant::CustomError {
message: "error: big one".to_owned(),
},
start.span(&end),
);
assert_eq!(
format!("{}", error),
vec![
" --> 1:2",
" |",
"1 | ab␊",
" | ...",
"3 | efgh",
" | ^^",
" |",
" = error: big one",
]
.join("\n")
);
}
#[test]
fn display_custom_span_two_lines_inverted_cols() {
let input = "abcdef\ngh";
let start = position::Position::new(input, 5).unwrap();
let end = position::Position::new(input, 8).unwrap();
let error: Error<u32> = Error::new_from_span(
ErrorVariant::CustomError {
message: "error: big one".to_owned(),
},
start.span(&end),
);
assert_eq!(
format!("{}", error),
vec![
" --> 1:6",
" |",
"1 | abcdef␊",
"2 | gh",
" | ^----^",
" |",
" = error: big one",
]
.join("\n")
);
}
#[test]
fn display_custom_span_end_after_newline() {
let input = "abcdef\n";
let start = position::Position::new(input, 0).unwrap();
let end = position::Position::new(input, 7).unwrap();
assert!(start.at_start());
assert!(end.at_end());
let error: Error<u32> = Error::new_from_span(
ErrorVariant::CustomError {
message: "error: big one".to_owned(),
},
start.span(&end),
);
assert_eq!(
format!("{}", error),
vec![
" --> 1:1",
" |",
"1 | abcdef␊",
" | ^-----^",
" |",
" = error: big one",
]
.join("\n")
);
}
#[test]
fn display_custom_span_empty() {
let input = "";
let start = position::Position::new(input, 0).unwrap();
let end = position::Position::new(input, 0).unwrap();
assert!(start.at_start());
assert!(end.at_end());
let error: Error<u32> = Error::new_from_span(
ErrorVariant::CustomError {
message: "error: empty".to_owned(),
},
start.span(&end),
);
assert_eq!(
format!("{}", error),
vec![
" --> 1:1",
" |",
"1 | ",
" | ^",
" |",
" = error: empty",
]
.join("\n")
);
}
#[test]
fn mapped_parsing_error() {
let input = "ab\ncd\nef";
let pos = position::Position::new(input, 4).unwrap();
let error: Error<u32> = Error::new_from_pos(
ErrorVariant::ParsingError {
positives: vec![1, 2, 3],
negatives: vec![4, 5, 6],
},
pos,
)
.renamed_rules(|n| format!("{}", n + 1));
assert_eq!(
format!("{}", error),
vec![
" --> 2:2",
" |",
"2 | cd␊",
" | ^---",
" |",
" = unexpected 5, 6, or 7; expected 2, 3, or 4",
]
.join("\n")
);
}
#[test]
fn error_with_path() {
let input = "ab\ncd\nef";
let pos = position::Position::new(input, 4).unwrap();
let error: Error<u32> = Error::new_from_pos(
ErrorVariant::ParsingError {
positives: vec![1, 2, 3],
negatives: vec![4, 5, 6],
},
pos,
)
.with_path("file.rs");
assert_eq!(
format!("{}", error),
vec![
" --> file.rs:2:2",
" |",
"2 | cd␊",
" | ^---",
" |",
" = unexpected 4, 5, or 6; expected 1, 2, or 3",
]
.join("\n")
);
}
#[test]
fn underline_with_tabs() {
let input = "a\txbc";
let pos = position::Position::new(input, 2).unwrap();
let error: Error<u32> = Error::new_from_pos(
ErrorVariant::ParsingError {
positives: vec![1, 2, 3],
negatives: vec![4, 5, 6],
},
pos,
)
.with_path("file.rs");
assert_eq!(
format!("{}", error),
vec![
" --> file.rs:1:3",
" |",
"1 | a xbc",
" | ^---",
" |",
" = unexpected 4, 5, or 6; expected 1, 2, or 3",
]
.join("\n")
);
}
}
// pest-2.1.2/src/iterators/flat_pairs.rs

// pest. The Elegant Parser
// Copyright (c) 2018 Dragoș Tiselice
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT
// license <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. All files in the project carrying such notice may not be copied,
// modified, or distributed except according to those terms.
use std::fmt;
use std::rc::Rc;
use super::pair::{self, Pair};
use super::queueable_token::QueueableToken;
use super::tokens::{self, Tokens};
use RuleType;
/// An iterator over [`Pair`]s. It is created by [`Pairs::flatten`].
///
/// [`Pair`]: struct.Pair.html
/// [`Pairs::flatten`]: struct.Pairs.html#method.flatten
pub struct FlatPairs<'i, R> {
/// # Safety
///
/// All `QueueableToken`s' `input_pos` must be valid character boundary indices into `input`.
queue: Rc<Vec<QueueableToken<R>>>,
input: &'i str,
start: usize,
end: usize,
}
/// # Safety
///
/// All `QueueableToken`s' `input_pos` must be valid character boundary indices into `input`.
pub unsafe fn new<'i, R: RuleType>(
queue: Rc<Vec<QueueableToken<R>>>,
input: &'i str,
start: usize,
end: usize,
) -> FlatPairs<'i, R> {
FlatPairs {
queue,
input,
start,
end,
}
}
impl<'i, R: RuleType> FlatPairs<'i, R> {
/// Returns the `Tokens` for these pairs.
///
/// # Examples
///
/// ```
/// # use std::rc::Rc;
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a
/// }
///
/// let input = "";
/// let pairs = pest::state(input, |state| {
/// // generating Token pair with Rule::a ...
/// # state.rule(Rule::a, |s| Ok(s))
/// }).unwrap();
/// let tokens: Vec<_> = pairs.flatten().tokens().collect();
///
/// assert_eq!(tokens.len(), 2);
/// ```
#[inline]
pub fn tokens(self) -> Tokens<'i, R> {
tokens::new(self.queue, self.input, self.start, self.end)
}
// Advances `start` forward to the next `Start` token, or past `end` if none remains.
fn next_start(&mut self) {
self.start += 1;
while self.start < self.end && !self.is_start(self.start) {
self.start += 1;
}
}
// Moves `end` backward to the previous `Start` token.
fn next_start_from_end(&mut self) {
self.end -= 1;
while self.end >= self.start && !self.is_start(self.end) {
self.end -= 1;
}
}
fn is_start(&self, index: usize) -> bool {
match self.queue[index] {
QueueableToken::Start { .. } => true,
QueueableToken::End { .. } => false,
}
}
}
impl<'i, R: RuleType> Iterator for FlatPairs<'i, R> {
type Item = Pair<'i, R>;
fn next(&mut self) -> Option<Self::Item> {
if self.start >= self.end {
return None;
}
let pair = unsafe { pair::new(Rc::clone(&self.queue), self.input, self.start) };
self.next_start();
Some(pair)
}
}
impl<'i, R: RuleType> DoubleEndedIterator for FlatPairs<'i, R> {
fn next_back(&mut self) -> Option<Self::Item> {
if self.end <= self.start {
return None;
}
self.next_start_from_end();
let pair = unsafe { pair::new(Rc::clone(&self.queue), self.input, self.end) };
Some(pair)
}
}
impl<'i, R: RuleType> fmt::Debug for FlatPairs<'i, R> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
f.debug_struct("FlatPairs")
.field("pairs", &self.clone().collect::>())
.finish()
}
}
impl<'i, R: Clone> Clone for FlatPairs<'i, R> {
fn clone(&self) -> FlatPairs<'i, R> {
FlatPairs {
queue: Rc::clone(&self.queue),
input: self.input,
start: self.start,
end: self.end,
}
}
}
#[cfg(test)]
mod tests {
use super::super::super::macros::tests::*;
use super::super::super::Parser;
#[test]
fn iter_for_flat_pairs() {
let pairs = AbcParser::parse(Rule::a, "abcde").unwrap();
assert_eq!(
pairs.flatten().map(|p| p.as_rule()).collect::<Vec<_>>(),
vec![Rule::a, Rule::b, Rule::c]
);
}
#[test]
fn double_ended_iter_for_flat_pairs() {
let pairs = AbcParser::parse(Rule::a, "abcde").unwrap();
assert_eq!(
pairs
.flatten()
.rev()
.map(|p| p.as_rule())
.collect::<Vec<_>>(),
vec![Rule::c, Rule::b, Rule::a]
);
}
}
// pest-2.1.2/src/iterators/mod.rs

// pest. The Elegant Parser
// Copyright (c) 2018 Dragoș Tiselice
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT
// license <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. All files in the project carrying such notice may not be copied,
// modified, or distributed except according to those terms.
//! Types and iterators for parser output.
mod flat_pairs;
mod pair;
pub(crate) mod pairs;
mod queueable_token;
mod tokens;
pub use self::flat_pairs::FlatPairs;
pub use self::pair::Pair;
pub use self::pairs::Pairs;
pub(crate) use self::queueable_token::QueueableToken;
pub use self::tokens::Tokens;
// pest-2.1.2/src/iterators/pair.rs

// pest. The Elegant Parser
// Copyright (c) 2018 Dragoș Tiselice
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT
// license <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. All files in the project carrying such notice may not be copied,
// modified, or distributed except according to those terms.
use std::fmt;
use std::hash::{Hash, Hasher};
use std::ptr;
use std::rc::Rc;
use std::str;
#[cfg(feature = "pretty-print")]
use serde::ser::SerializeStruct;
use super::pairs::{self, Pairs};
use super::queueable_token::QueueableToken;
use super::tokens::{self, Tokens};
use span::{self, Span};
use RuleType;
/// A matching pair of [`Token`]s and everything between them.
///
/// A matching `Token` pair is formed by a `Token::Start` and a subsequent `Token::End` with the
/// same `Rule`, with the condition that all `Token`s between them can form such pairs as well.
/// This is similar to the [brace matching problem](https://en.wikipedia.org/wiki/Brace_matching) in
/// editors.
///
/// [`Token`]: ../enum.Token.html
#[derive(Clone)]
pub struct Pair<'i, R> {
/// # Safety
///
/// All `QueueableToken`s' `input_pos` must be valid character boundary indices into `input`.
queue: Rc<Vec<QueueableToken<R>>>,
input: &'i str,
/// Token index into `queue`.
start: usize,
}
/// # Safety
///
/// All `QueueableToken`s' `input_pos` must be valid character boundary indices into `input`.
pub unsafe fn new<'i, R: RuleType>(
queue: Rc<Vec<QueueableToken<R>>>,
input: &'i str,
start: usize,
) -> Pair<'i, R> {
Pair {
queue,
input,
start,
}
}
impl<'i, R: RuleType> Pair<'i, R> {
/// Returns the `Rule` of the `Pair`.
///
/// # Examples
///
/// ```
/// # use std::rc::Rc;
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a
/// }
///
/// let input = "";
/// let pair = pest::state(input, |state| {
/// // generating Token pair with Rule::a ...
/// # state.rule(Rule::a, |s| Ok(s))
/// }).unwrap().next().unwrap();
///
/// assert_eq!(pair.as_rule(), Rule::a);
/// ```
#[inline]
pub fn as_rule(&self) -> R {
match self.queue[self.pair()] {
QueueableToken::End { rule, .. } => rule,
_ => unreachable!(),
}
}
/// Captures a slice from the `&str` defined by the token `Pair`.
///
/// # Examples
///
/// ```
/// # use std::rc::Rc;
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// ab
/// }
///
/// let input = "ab";
/// let pair = pest::state(input, |state| {
/// // generating Token pair with Rule::ab ...
/// # state.rule(Rule::ab, |s| s.match_string("ab"))
/// }).unwrap().next().unwrap();
///
/// assert_eq!(pair.as_str(), "ab");
/// ```
#[inline]
pub fn as_str(&self) -> &'i str {
let start = self.pos(self.start);
let end = self.pos(self.pair());
// Generated positions always come from Positions and are UTF-8 borders.
&self.input[start..end]
}
/// Returns the `Span` defined by the `Pair`, consuming it.
///
/// # Examples
///
/// ```
/// # use std::rc::Rc;
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// ab
/// }
///
/// let input = "ab";
/// let pair = pest::state(input, |state| {
/// // generating Token pair with Rule::ab ...
/// # state.rule(Rule::ab, |s| s.match_string("ab"))
/// }).unwrap().next().unwrap();
///
/// assert_eq!(pair.into_span().as_str(), "ab");
/// ```
#[inline]
#[deprecated(since = "2.0.0", note = "Please use `as_span` instead")]
pub fn into_span(self) -> Span<'i> {
self.as_span()
}
/// Returns the `Span` defined by the `Pair`, **without** consuming it.
///
/// # Examples
///
/// ```
/// # use std::rc::Rc;
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// ab
/// }
///
/// let input = "ab";
/// let pair = pest::state(input, |state| {
/// // generating Token pair with Rule::ab ...
/// # state.rule(Rule::ab, |s| s.match_string("ab"))
/// }).unwrap().next().unwrap();
///
/// assert_eq!(pair.as_span().as_str(), "ab");
/// ```
#[inline]
pub fn as_span(&self) -> Span<'i> {
let start = self.pos(self.start);
let end = self.pos(self.pair());
// Generated positions always come from Positions and are UTF-8 borders.
unsafe { span::Span::new_unchecked(self.input, start, end) }
}
/// Returns the inner `Pairs` between the `Pair`, consuming it.
///
/// # Examples
///
/// ```
/// # use std::rc::Rc;
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a
/// }
///
/// let input = "";
/// let pair = pest::state(input, |state| {
/// // generating Token pair with Rule::a ...
/// # state.rule(Rule::a, |s| Ok(s))
/// }).unwrap().next().unwrap();
///
/// assert!(pair.into_inner().next().is_none());
/// ```
#[inline]
pub fn into_inner(self) -> Pairs<'i, R> {
let pair = self.pair();
pairs::new(self.queue, self.input, self.start + 1, pair)
}
/// Returns the `Tokens` for the `Pair`.
///
/// # Examples
///
/// ```
/// # use std::rc::Rc;
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a
/// }
///
/// let input = "";
/// let pair = pest::state(input, |state| {
/// // generating Token pair with Rule::a ...
/// # state.rule(Rule::a, |s| Ok(s))
/// }).unwrap().next().unwrap();
/// let tokens: Vec<_> = pair.tokens().collect();
///
/// assert_eq!(tokens.len(), 2);
/// ```
#[inline]
pub fn tokens(self) -> Tokens<'i, R> {
let end = self.pair();
tokens::new(self.queue, self.input, self.start, end + 1)
}
/// Generates a string that stores the lexical information of `self` in
/// a pretty-printed JSON format.
#[cfg(feature = "pretty-print")]
pub fn to_json(&self) -> String {
::serde_json::to_string_pretty(self).expect("Failed to pretty-print Pair to json.")
}
fn pair(&self) -> usize {
match self.queue[self.start] {
QueueableToken::Start {
end_token_index, ..
} => end_token_index,
_ => unreachable!(),
}
}
fn pos(&self, index: usize) -> usize {
match self.queue[index] {
QueueableToken::Start { input_pos, .. } | QueueableToken::End { input_pos, .. } => {
input_pos
}
}
}
}
impl<'i, R: RuleType> Pairs<'i, R> {
/// Create a new `Pairs` iterator containing just the single `Pair`.
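// (Illustrative doc-test added in the style of this module's other examples;
// `Rule` and the `pest::state` closure are example scaffolding.)
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a
/// }
///
/// let input = "";
/// let pair = pest::state(input, |state| {
/// // generating Token pair with Rule::a ...
/// # state.rule(Rule::a, |s| Ok(s))
/// }).unwrap().next().unwrap();
/// let pairs = pest::iterators::Pairs::single(pair);
///
/// assert_eq!(pairs.count(), 1);
/// ```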
pub fn single(pair: Pair<'i, R>) -> Self {
let end = pair.pair();
pairs::new(pair.queue, pair.input, pair.start, end)
}
}
impl<'i, R: RuleType> fmt::Debug for Pair<'i, R> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
f.debug_struct("Pair")
.field("rule", &self.as_rule())
.field("span", &self.as_span())
.field("inner", &self.clone().into_inner().collect::>())
.finish()
}
}
impl<'i, R: RuleType> fmt::Display for Pair<'i, R> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let rule = self.as_rule();
let start = self.pos(self.start);
let end = self.pos(self.pair());
let mut pairs = self.clone().into_inner().peekable();
if pairs.peek().is_none() {
write!(f, "{:?}({}, {})", rule, start, end)
} else {
write!(
f,
"{:?}({}, {}, [{}])",
rule,
start,
end,
pairs
.map(|pair| format!("{}", pair))
.collect::<Vec<_>>()
.join(", ")
)
}
}
}
impl<'i, R: PartialEq> PartialEq for Pair<'i, R> {
fn eq(&self, other: &Pair<'i, R>) -> bool {
Rc::ptr_eq(&self.queue, &other.queue)
&& ptr::eq(self.input, other.input)
&& self.start == other.start
}
}
impl<'i, R: Eq> Eq for Pair<'i, R> {}
impl<'i, R: Hash> Hash for Pair<'i, R> {
fn hash<H: Hasher>(&self, state: &mut H) {
(&*self.queue as *const Vec<QueueableToken<R>>).hash(state);
(self.input as *const str).hash(state);
self.start.hash(state);
}
}
#[cfg(feature = "pretty-print")]
impl<'i, R: RuleType> ::serde::Serialize for Pair<'i, R> {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: ::serde::Serializer,
{
let start = self.pos(self.start);
let end = self.pos(self.pair());
let rule = format!("{:?}", self.as_rule());
let inner = self.clone().into_inner();
let mut ser = serializer.serialize_struct("Pairs", 3)?;
ser.serialize_field("pos", &(start, end))?;
ser.serialize_field("rule", &rule)?;
if inner.peek().is_none() {
ser.serialize_field("inner", &self.as_str())?;
} else {
ser.serialize_field("inner", &inner)?;
}
ser.end()
}
}
#[cfg(test)]
mod tests {
use macros::tests::*;
use parser::Parser;
#[test]
#[cfg(feature = "pretty-print")]
fn test_pretty_print() {
let pair = AbcParser::parse(Rule::a, "abcde").unwrap().next().unwrap();
let expected = r#"{
"pos": [
0,
3
],
"rule": "a",
"inner": {
"pos": [
1,
2
],
"pairs": [
{
"pos": [
1,
2
],
"rule": "b",
"inner": "b"
}
]
}
}"#;
assert_eq!(expected, pair.to_json());
}
#[test]
fn pair_into_inner() {
let pair = AbcParser::parse(Rule::a, "abcde").unwrap().next().unwrap(); // the tokens a(b())
let pairs = pair.into_inner(); // the tokens b()
assert_eq!(2, pairs.tokens().count());
}
}
// pest-2.1.2/src/iterators/pairs.rs

// pest. The Elegant Parser
// Copyright (c) 2018 Dragoș Tiselice
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT
// license <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. All files in the project carrying such notice may not be copied,
// modified, or distributed except according to those terms.
use std::fmt;
use std::hash::{Hash, Hasher};
use std::ptr;
use std::rc::Rc;
use std::str;
#[cfg(feature = "pretty-print")]
use serde::ser::SerializeStruct;
use super::flat_pairs::{self, FlatPairs};
use super::pair::{self, Pair};
use super::queueable_token::QueueableToken;
use super::tokens::{self, Tokens};
use RuleType;
/// An iterator over [`Pair`]s. It is created by [`pest::state`] and [`Pair::into_inner`].
///
/// [`Pair`]: struct.Pair.html
/// [`pest::state`]: ../fn.state.html
/// [`Pair::into_inner`]: struct.Pair.html#method.into_inner
#[derive(Clone)]
pub struct Pairs<'i, R> {
queue: Rc<Vec<QueueableToken<R>>>,
input: &'i str,
start: usize,
end: usize,
}
pub fn new<'i, R: RuleType>(
queue: Rc<Vec<QueueableToken<R>>>,
input: &'i str,
start: usize,
end: usize,
) -> Pairs<'i, R> {
Pairs {
queue,
input,
start,
end,
}
}
impl<'i, R: RuleType> Pairs<'i, R> {
/// Captures a slice from the `&str` defined by the starting position of the first token `Pair`
/// and the ending position of the last token `Pair` of the `Pairs`. This also captures
/// the input between those two token `Pair`s.
///
/// # Examples
///
/// ```
/// # use std::rc::Rc;
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a,
/// b
/// }
///
/// let input = "a b";
/// let pairs = pest::state(input, |state| {
/// // generating Token pairs with Rule::a and Rule::b ...
/// # state.rule(Rule::a, |s| s.match_string("a")).and_then(|s| s.skip(1))
/// # .and_then(|s| s.rule(Rule::b, |s| s.match_string("b")))
/// }).unwrap();
///
/// assert_eq!(pairs.as_str(), "a b");
/// ```
#[inline]
pub fn as_str(&self) -> &'i str {
if self.start < self.end {
let start = self.pos(self.start);
let end = self.pos(self.end - 1);
// Generated positions always come from Positions and are UTF-8 borders.
&self.input[start..end]
} else {
""
}
}
/// Captures inner token `Pair`s and concatenates resulting `&str`s. This does not capture
/// the input between token `Pair`s.
///
/// # Examples
///
/// ```
/// # use std::rc::Rc;
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a,
/// b
/// }
///
/// let input = "a b";
/// let pairs = pest::state(input, |state| {
/// // generating Token pairs with Rule::a and Rule::b ...
/// # state.rule(Rule::a, |s| s.match_string("a")).and_then(|s| s.skip(1))
/// # .and_then(|s| s.rule(Rule::b, |s| s.match_string("b")))
/// }).unwrap();
///
/// assert_eq!(pairs.concat(), "ab");
/// ```
#[inline]
pub fn concat(&self) -> String {
self.clone()
.fold(String::new(), |string, pair| string + pair.as_str())
}
/// Flattens the `Pairs`.
///
/// # Examples
///
/// ```
/// # use std::rc::Rc;
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a,
/// b
/// }
///
/// let input = "";
/// let pairs = pest::state(input, |state| {
/// // generating nested Token pair with Rule::b inside Rule::a
/// # state.rule(Rule::a, |state| {
/// # state.rule(Rule::b, |s| Ok(s))
/// # })
/// }).unwrap();
/// let tokens: Vec<_> = pairs.flatten().tokens().collect();
///
/// assert_eq!(tokens.len(), 4);
/// ```
#[inline]
pub fn flatten(self) -> FlatPairs<'i, R> {
unsafe { flat_pairs::new(self.queue, self.input, self.start, self.end) }
}
/// Returns the `Tokens` for the `Pairs`.
///
/// # Examples
///
/// ```
/// # use std::rc::Rc;
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a
/// }
///
/// let input = "";
/// let pairs = pest::state(input, |state| {
/// // generating Token pair with Rule::a ...
/// # state.rule(Rule::a, |s| Ok(s))
/// }).unwrap();
/// let tokens: Vec<_> = pairs.tokens().collect();
///
/// assert_eq!(tokens.len(), 2);
/// ```
#[inline]
pub fn tokens(self) -> Tokens<'i, R> {
tokens::new(self.queue, self.input, self.start, self.end)
}
/// Peek at the first inner `Pair` without changing the position of this iterator.
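// (Illustrative doc-test added in the style of this module's other examples;
// `Rule` and the `pest::state` closure are example scaffolding.)
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a
/// }
///
/// let input = "";
/// let mut pairs = pest::state(input, |state| {
/// // generating Token pair with Rule::a ...
/// # state.rule(Rule::a, |s| Ok(s))
/// }).unwrap();
///
/// // peeking does not advance the iterator
/// assert_eq!(pairs.peek().map(|p| p.as_rule()), Some(Rule::a));
/// assert_eq!(pairs.next().map(|p| p.as_rule()), Some(Rule::a));
/// ```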
#[inline]
pub fn peek(&self) -> Option<Pair<'i, R>> {
if self.start < self.end {
Some(unsafe { pair::new(Rc::clone(&self.queue), self.input, self.start) })
} else {
None
}
}
/// Generates a string that stores the lexical information of `self` in
/// a pretty-printed JSON format.
#[cfg(feature = "pretty-print")]
pub fn to_json(&self) -> String {
::serde_json::to_string_pretty(self).expect("Failed to pretty-print Pairs to json.")
}
fn pair(&self) -> usize {
match self.queue[self.start] {
QueueableToken::Start {
end_token_index, ..
} => end_token_index,
_ => unreachable!(),
}
}
fn pair_from_end(&self) -> usize {
match self.queue[self.end - 1] {
QueueableToken::End {
start_token_index, ..
} => start_token_index,
_ => unreachable!(),
}
}
fn pos(&self, index: usize) -> usize {
match self.queue[index] {
QueueableToken::Start { input_pos, .. } | QueueableToken::End { input_pos, .. } => {
input_pos
}
}
}
}
impl<'i, R: RuleType> Iterator for Pairs<'i, R> {
type Item = Pair<'i, R>;
fn next(&mut self) -> Option<Self::Item> {
let pair = self.peek()?;
self.start = self.pair() + 1;
Some(pair)
}
}
impl<'i, R: RuleType> DoubleEndedIterator for Pairs<'i, R> {
fn next_back(&mut self) -> Option<Self::Item> {
if self.end <= self.start {
return None;
}
self.end = self.pair_from_end();
let pair = unsafe { pair::new(Rc::clone(&self.queue), self.input, self.end) };
Some(pair)
}
}
impl<'i, R: RuleType> fmt::Debug for Pairs<'i, R> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
f.debug_list().entries(self.clone()).finish()
}
}
impl<'i, R: RuleType> fmt::Display for Pairs<'i, R> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(
f,
"[{}]",
self.clone()
.map(|pair| format!("{}", pair))
.collect::<Vec<_>>()
.join(", ")
)
}
}
impl<'i, R: PartialEq> PartialEq for Pairs<'i, R> {
fn eq(&self, other: &Pairs<'i, R>) -> bool {
Rc::ptr_eq(&self.queue, &other.queue)
&& ptr::eq(self.input, other.input)
&& self.start == other.start
&& self.end == other.end
}
}
impl<'i, R: Eq> Eq for Pairs<'i, R> {}
impl<'i, R: Hash> Hash for Pairs<'i, R> {
fn hash<H: Hasher>(&self, state: &mut H) {
(&*self.queue as *const Vec<QueueableToken<R>>).hash(state);
(self.input as *const str).hash(state);
self.start.hash(state);
self.end.hash(state);
}
}
#[cfg(feature = "pretty-print")]
impl<'i, R: RuleType> ::serde::Serialize for Pairs<'i, R> {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: ::serde::Serializer,
{
let start = self.pos(self.start);
let end = self.pos(self.end - 1);
let pairs = self.clone().collect::<Vec<_>>();
let mut ser = serializer.serialize_struct("Pairs", 2)?;
ser.serialize_field("pos", &(start, end))?;
ser.serialize_field("pairs", &pairs)?;
ser.end()
}
}
#[cfg(test)]
mod tests {
use super::super::super::macros::tests::*;
use super::super::super::Parser;
#[test]
#[cfg(feature = "pretty-print")]
fn test_pretty_print() {
let pairs = AbcParser::parse(Rule::a, "abcde").unwrap();
let expected = r#"{
"pos": [
0,
5
],
"pairs": [
{
"pos": [
0,
3
],
"rule": "a",
"inner": {
"pos": [
1,
2
],
"pairs": [
{
"pos": [
1,
2
],
"rule": "b",
"inner": "b"
}
]
}
},
{
"pos": [
4,
5
],
"rule": "c",
"inner": "e"
}
]
}"#;
assert_eq!(expected, pairs.to_json());
}
#[test]
fn as_str() {
let pairs = AbcParser::parse(Rule::a, "abcde").unwrap();
assert_eq!(pairs.as_str(), "abcde");
}
#[test]
fn as_str_empty() {
let mut pairs = AbcParser::parse(Rule::a, "abcde").unwrap();
assert_eq!(pairs.nth(1).unwrap().into_inner().as_str(), "");
}
#[test]
fn concat() {
let pairs = AbcParser::parse(Rule::a, "abcde").unwrap();
assert_eq!(pairs.concat(), "abce");
}
#[test]
fn pairs_debug() {
let pairs = AbcParser::parse(Rule::a, "abcde").unwrap();
#[rustfmt::skip]
assert_eq!(
format!("{:?}", pairs),
"[\
Pair { rule: a, span: Span { str: \"abc\", start: 0, end: 3 }, inner: [\
Pair { rule: b, span: Span { str: \"b\", start: 1, end: 2 }, inner: [] }\
] }, \
Pair { rule: c, span: Span { str: \"e\", start: 4, end: 5 }, inner: [] }\
]"
.to_owned()
);
}
#[test]
fn pairs_display() {
let pairs = AbcParser::parse(Rule::a, "abcde").unwrap();
assert_eq!(
format!("{}", pairs),
"[a(0, 3, [b(1, 2)]), c(4, 5)]".to_owned()
);
}
#[test]
fn iter_for_pairs() {
let pairs = AbcParser::parse(Rule::a, "abcde").unwrap();
assert_eq!(
pairs.map(|p| p.as_rule()).collect::<Vec<_>>(),
vec![Rule::a, Rule::c]
);
}
#[test]
fn double_ended_iter_for_pairs() {
let pairs = AbcParser::parse(Rule::a, "abcde").unwrap();
assert_eq!(
pairs.rev().map(|p| p.as_rule()).collect::<Vec<_>>(),
vec![Rule::c, Rule::a]
);
}
}
// pest-2.1.2/src/iterators/queueable_token.rs

// pest. The Elegant Parser
// Copyright (c) 2018 Dragoș Tiselice
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT
// license <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. All files in the project carrying such notice may not be copied,
// modified, or distributed except according to those terms.
// This structure serves to improve performance over Token objects in two ways:
//
// * it is smaller than a Token, leading to both less memory use when stored in the queue but also
// increased speed when pushing to the queue
// * it finds its pair in O(1) time instead of O(N), since pair positions are known at parse time
// and can easily be stored instead of recomputed
#[derive(Debug)]
pub enum QueueableToken<R> {
Start {
end_token_index: usize,
input_pos: usize,
},
End {
start_token_index: usize,
rule: R,
input_pos: usize,
},
}
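// As an illustration of the O(1) pairing described above (hypothetical indices
// and positions, not taken from a real parse), a rule `b` nested inside a
// rule `a` might be queued as:
//
//   0: Start { end_token_index: 3, input_pos: 0 }            // `a` begins
//   1: Start { end_token_index: 2, input_pos: 1 }            // `b` begins
//   2: End   { start_token_index: 1, rule: b, input_pos: 2 } // `b` ends
//   3: End   { start_token_index: 0, rule: a, input_pos: 3 } // `a` ends
//
// Finding the matching `End` for a `Start` (or vice versa) is then a single
// index into the queue rather than a scan over the intervening tokens.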
// pest-2.1.2/src/iterators/tokens.rs

// pest. The Elegant Parser
// Copyright (c) 2018 Dragoș Tiselice
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT
// license <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. All files in the project carrying such notice may not be copied,
// modified, or distributed except according to those terms.
use std::fmt;
use std::rc::Rc;
use std::str;
use super::queueable_token::QueueableToken;
use position;
use token::Token;
use RuleType;
/// An iterator over [`Token`]s. It is created by [`Pair::tokens`] and [`Pairs::tokens`].
///
/// [`Token`]: ../enum.Token.html
/// [`Pair::tokens`]: struct.Pair.html#method.tokens
/// [`Pairs::tokens`]: struct.Pairs.html#method.tokens
#[derive(Clone)]
pub struct Tokens<'i, R> {
/// # Safety:
///
/// All `QueueableToken`s' `input_pos` must be valid character boundary indices into `input`.
queue: Rc<Vec<QueueableToken<R>>>,
input: &'i str,
start: usize,
end: usize,
}
// TODO(safety): QueueableTokens must be valid indices into input.
pub fn new<'i, R: RuleType>(
queue: Rc<Vec<QueueableToken<R>>>,
input: &'i str,
start: usize,
end: usize,
) -> Tokens<'i, R> {
if cfg!(debug_assertions) {
for tok in queue.iter() {
match *tok {
QueueableToken::Start { input_pos, .. } | QueueableToken::End { input_pos, .. } => {
assert!(
input.get(input_pos..).is_some(),
"💥 UNSAFE `Tokens` CREATED 💥"
)
}
}
}
}
Tokens {
queue,
input,
start,
end,
}
}
impl<'i, R: RuleType> Tokens<'i, R> {
fn create_token(&self, index: usize) -> Token<'i, R> {
match self.queue[index] {
QueueableToken::Start {
end_token_index,
input_pos,
} => {
let rule = match self.queue[end_token_index] {
QueueableToken::End { rule, .. } => rule,
_ => unreachable!(),
};
Token::Start {
rule,
// QueueableTokens are safely created.
pos: unsafe { position::Position::new_unchecked(self.input, input_pos) },
}
}
QueueableToken::End {
rule, input_pos, ..
} => {
Token::End {
rule,
// QueueableTokens are safely created.
pos: unsafe { position::Position::new_unchecked(self.input, input_pos) },
}
}
}
}
}
impl<'i, R: RuleType> Iterator for Tokens<'i, R> {
type Item = Token<'i, R>;
fn next(&mut self) -> Option<Self::Item> {
if self.start >= self.end {
return None;
}
let token = self.create_token(self.start);
self.start += 1;
Some(token)
}
}
impl<'i, R: RuleType> DoubleEndedIterator for Tokens<'i, R> {
fn next_back(&mut self) -> Option<Self::Item> {
if self.end <= self.start {
return None;
}
let token = self.create_token(self.end - 1);
self.end -= 1;
Some(token)
}
}
impl<'i, R: RuleType> fmt::Debug for Tokens<'i, R> {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
f.debug_list().entries(self.clone()).finish()
}
}
#[cfg(test)]
mod tests {
use super::super::super::macros::tests::*;
use super::super::super::Parser;
use super::Token;
#[test]
fn double_ended_iter_for_tokens() {
let pairs = AbcParser::parse(Rule::a, "abcde").unwrap();
let mut tokens = pairs.clone().tokens().collect::<Vec<Token<Rule>>>();
tokens.reverse();
let reverse_tokens = pairs.tokens().rev().collect::<Vec<Token<Rule>>>();
assert_eq!(tokens, reverse_tokens);
}
}
// pest-2.1.2/src/lib.rs

// pest. The Elegant Parser
// Copyright (c) 2018 Dragoș Tiselice
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT
// license <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. All files in the project carrying such notice may not be copied,
// modified, or distributed except according to those terms.
//! # pest. The Elegant Parser
//!
//! pest is a general purpose parser written in Rust with a focus on accessibility, correctness,
//! and performance. It uses parsing expression grammars (or [PEG]) as input, which are similar in
//! spirit to regular expressions, but which offer the enhanced expressivity needed to parse
//! complex languages.
//!
//! [PEG]: https://en.wikipedia.org/wiki/Parsing_expression_grammar
//!
//! ## Getting started
//!
//! The recommended way to start parsing with pest is to read the official [book].
//!
//! Other helpful resources:
//!
//! * API reference on [docs.rs]
//! * play with grammars and share them on our [fiddle]
//! * leave feedback, ask questions, or greet us on [Gitter]
//!
//! [book]: https://pest-parser.github.io/book
//! [docs.rs]: https://docs.rs/pest
//! [fiddle]: https://pest-parser.github.io/#editor
//! [Gitter]: https://gitter.im/dragostis/pest
//!
//! ## Usage
//!
//! The core of pest is the trait [`Parser`], which provides an interface to the parsing
//! functionality.
//!
//! The accompanying crate `pest_derive` can automatically generate a [`Parser`] from a PEG
//! grammar. Using `pest_derive` is highly encouraged, but it is also possible to implement
//! [`Parser`] manually if required.
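//!
//! As a sketch of what usage looks like once a parser exists (the `MyParser` type and the
//! `Rule::ident_list` rule below are hypothetical placeholders for whatever a given grammar
//! defines), parsing is a single call to `Parser::parse`:
//!
//! ```ignore
//! let pairs = MyParser::parse(Rule::ident_list, "a1 b2").expect("unsuccessful parse");
//! for pair in pairs {
//!     // Each `Pair` carries the matched rule and the matched slice of input.
//!     println!("{:?}: {}", pair.as_rule(), pair.as_str());
//! }
//! ```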
//!
//! ## `.pest` files
//!
//! Grammar definitions reside in custom `.pest` files located in the crate `src` directory.
//! Parsers are automatically generated from these files using `#[derive(Parser)]` and a special
//! `#[grammar = "..."]` attribute on a dummy struct.
//!
//! ```ignore
//! #[derive(Parser)]
//! #[grammar = "path/to/my_grammar.pest"] // relative to src
//! struct MyParser;
//! ```
//!
//! The syntax of `.pest` files is documented in the [`pest_derive` crate].
//!
//! ## Inline grammars
//!
//! Grammars can also be inlined by using the `#[grammar_inline = "..."]` attribute.
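//!
//! A minimal sketch (the grammar string below is illustrative, not part of this crate):
//!
//! ```ignore
//! #[derive(Parser)]
//! #[grammar_inline = "digit = { '0'..'9' }"]
//! struct InlineParser;
//! ```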
//!
//! [`Parser`]: trait.Parser.html
//! [`pest_derive` crate]: https://docs.rs/pest_derive/
#![doc(html_root_url = "https://docs.rs/pest")]
extern crate ucd_trie;
#[cfg(feature = "pretty-print")]
extern crate serde;
#[cfg(feature = "pretty-print")]
extern crate serde_json;
pub use parser::Parser;
pub use parser_state::{state, Atomicity, Lookahead, MatchDir, ParseResult, ParserState};
pub use position::Position;
pub use span::{Lines, Span};
use std::fmt::Debug;
use std::hash::Hash;
pub use token::Token;
pub mod error;
pub mod iterators;
mod macros;
mod parser;
mod parser_state;
mod position;
pub mod prec_climber;
mod span;
mod stack;
mod token;
#[doc(hidden)]
pub mod unicode;
/// A trait which parser rules must implement.
///
/// This trait is set up so that any struct that implements all of its required traits will
/// automatically implement this trait as well.
///
/// This is essentially a [trait alias](https://github.com/rust-lang/rfcs/pull/1733). When trait
/// aliases are implemented, this may be replaced by one.
pub trait RuleType: Copy + Debug + Eq + Hash + Ord {}
impl<T: Copy + Debug + Eq + Hash + Ord> RuleType for T {}
pest-2.1.2/src/macros.rs
// pest. The Elegant Parser
// Copyright (c) 2018 Dragoș Tiselice
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT
// license <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. All files in the project carrying such notice may not be copied,
// modified, or distributed except according to those terms.
#[doc(hidden)]
#[macro_export]
macro_rules! consumes_to {
( $_rules:ident, $tokens:expr, [] ) => ();
( $rules:ident, $tokens:expr, [ $name:ident ( $start:expr, $end:expr ) ] ) => {
let expected = format!("expected Start {{ rule: {:?}, pos: Position {{ pos: {} }} }}",
$rules::$name, $start);
match $tokens.next().expect(&format!("{} but found nothing", expected)) {
$crate::Token::Start { rule, pos } => {
let message = format!("{} but found Start {{ rule: {:?}, pos: Position {{ {} }} }}",
expected, rule, pos.pos());
if rule != $rules::$name || pos.pos() != $start {
panic!("{}", message);
}
},
token => panic!("{}", format!("{} but found {:?}", expected, token))
};
let expected = format!("expected End {{ rule: {:?}, pos: Position {{ pos: {} }} }}",
$rules::$name, $end);
match $tokens.next().expect(&format!("{} but found nothing", expected)) {
$crate::Token::End { rule, pos } => {
let message = format!("{} but found End {{ rule: {:?}, pos: Position {{ {} }} }}",
expected, rule, pos.pos());
if rule != $rules::$name || pos.pos() != $end {
panic!("{}", message);
}
},
token => panic!("{}", format!("{} but found {:?}", expected, token))
};
};
( $rules:ident, $tokens:expr, [ $name:ident ( $start:expr, $end:expr ),
$( $names:ident $calls:tt ),* $(,)* ] ) => {
let expected = format!("expected Start {{ rule: {:?}, pos: Position {{ pos: {} }} }}",
$rules::$name, $start);
match $tokens.next().expect(&format!("{} but found nothing", expected)) {
$crate::Token::Start { rule, pos } => {
let message = format!("{} but found Start {{ rule: {:?}, pos: Position {{ {} }} }}",
expected, rule, pos.pos());
if rule != $rules::$name || pos.pos() != $start {
panic!("{}", message);
}
},
token => panic!("{}", format!("{} but found {:?}", expected, token))
};
let expected = format!("expected End {{ rule: {:?}, pos: Position {{ pos: {} }} }}",
$rules::$name, $end);
match $tokens.next().expect(&format!("{} but found nothing", expected)) {
$crate::Token::End { rule, pos } => {
let message = format!("{} but found End {{ rule: {:?}, pos: Position {{ {} }} }}",
expected, rule, pos.pos());
if rule != $rules::$name || pos.pos() != $end {
panic!("{}", message);
}
},
token => panic!("{}", format!("{} but found {:?}", expected, token))
};
consumes_to!($rules, $tokens, [ $( $names $calls ),* ]);
};
( $rules:ident, $tokens:expr, [ $name:ident ( $start:expr, $end:expr,
[ $( $names:ident $calls:tt ),* $(,)* ] ) ] ) => {
let expected = format!("expected Start {{ rule: {:?}, pos: Position {{ pos: {} }} }}",
$rules::$name, $start);
match $tokens.next().expect(&format!("{} but found nothing", expected)) {
$crate::Token::Start { rule, pos } => {
let message = format!("{} but found Start {{ rule: {:?}, pos: Position {{ {} }} }}",
expected, rule, pos.pos());
if rule != $rules::$name || pos.pos() != $start {
panic!("{}", message);
}
},
token => panic!("{}", format!("{} but found {:?}", expected, token))
};
consumes_to!($rules, $tokens, [ $( $names $calls ),* ]);
let expected = format!("expected End {{ rule: {:?}, pos: Position {{ pos: {} }} }}",
$rules::$name, $end);
match $tokens.next().expect(&format!("{} but found nothing", expected)) {
$crate::Token::End { rule, pos } => {
let message = format!("{} but found End {{ rule: {:?}, pos: Position {{ {} }} }}",
expected, rule, pos.pos());
if rule != $rules::$name || pos.pos() != $end {
panic!("{}", message);
}
},
token => panic!("{}", format!("{} but found {:?}", expected, token))
};
};
( $rules:ident, $tokens:expr, [ $name:ident ( $start:expr, $end:expr,
[ $( $nested_names:ident $nested_calls:tt ),*
$(,)* ] ),
$( $names:ident $calls:tt ),* ] ) => {
let expected = format!("expected Start {{ rule: {:?}, pos: Position {{ pos: {} }} }}",
$rules::$name, $start);
match $tokens.next().expect(&format!("{} but found nothing", expected)) {
$crate::Token::Start { rule, pos } => {
let message = format!("{} but found Start {{ rule: {:?}, pos: Position {{ {} }} }}",
expected, rule, pos.pos());
if rule != $rules::$name || pos.pos() != $start {
panic!("{}", message);
}
},
token => panic!("{}", format!("{} but found {:?}", expected, token))
};
consumes_to!($rules, $tokens, [ $( $nested_names $nested_calls ),* ]);
let expected = format!("expected End {{ rule: {:?}, pos: Position {{ pos: {} }} }}",
$rules::$name, $end);
match $tokens.next().expect(&format!("{} but found nothing", expected)) {
$crate::Token::End { rule, pos } => {
let message = format!("{} but found End {{ rule: {:?}, pos: Position {{ {} }} }}",
expected, rule, pos.pos());
if rule != $rules::$name || pos.pos() != $end {
panic!("{}", message);
}
},
token => panic!("{}", format!("{} but found {:?}", expected, token))
};
consumes_to!($rules, $tokens, [ $( $names $calls ),* ]);
};
}
/// Testing tool that compares produced tokens.
///
/// This macro takes several arguments:
///
/// * `parser` - name of the data structure implementing `Parser`
/// * `input` - input to be tested against
/// * `rule` - `Rule` which will be run
/// * `tokens` - token pairs of the form `name(start_pos, end_pos, [nested_child_tokens])`
///
/// *Note:* `start_pos` and `end_pos` are byte positions.
///
/// # Examples
///
/// ```
/// # #[macro_use]
/// # extern crate pest;
/// # use pest::Parser;
/// # use pest::error::Error;
/// # use pest::iterators::Pairs;
/// # fn main() {
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// # enum Rule {
/// # a,
/// # b,
/// # c
/// # }
/// #
/// # struct AbcParser;
/// #
/// # impl Parser<Rule> for AbcParser {
/// # fn parse<'i>(_: Rule, input: &'i str) -> Result<Pairs<'i, Rule>, Error<Rule>> {
/// # pest::state(input, |state| {
/// # state.rule(Rule::a, |state| {
/// # state.skip(1).unwrap().rule(Rule::b, |state| {
/// # state.skip(1)
/// # }).unwrap().skip(1)
/// # }).and_then(|state| {
/// # state.skip(1).unwrap().rule(Rule::c, |state| {
/// # state.skip(1)
/// # })
/// # })
/// # })
/// # }
/// # }
/// parses_to! {
/// parser: AbcParser,
/// input: "abcde",
/// rule: Rule::a,
/// tokens: [
/// a(0, 3, [
/// b(1, 2)
/// ]),
/// c(4, 5)
/// ]
/// };
/// # }
/// ```
#[macro_export]
macro_rules! parses_to {
( parser: $parser:ident, input: $string:expr, rule: $rules:tt :: $rule:tt,
tokens: [ $( $names:ident $calls:tt ),* $(,)* ] ) => {
#[allow(unused_mut)]
{
use $crate::Parser;
let mut tokens = $parser::parse($rules::$rule, $string).unwrap().tokens();
consumes_to!($rules, &mut tokens, [ $( $names $calls ),* ]);
let rest: Vec<_> = tokens.collect();
match rest.len() {
0 => (),
2 => {
let (first, second) = (&rest[0], &rest[1]);
match (first, second) {
(
&$crate::Token::Start { rule: ref first_rule, .. },
&$crate::Token::End { rule: ref second_rule, .. }
) => {
assert!(
format!("{:?}", first_rule) == "EOI",
format!("expected end of input, but found {:?}", rest)
);
assert!(
format!("{:?}", second_rule) == "EOI",
format!("expected end of input, but found {:?}", rest)
);
}
_ => panic!("expected end of input, but found {:?}", rest)
}
}
_ => panic!("expected end of input, but found {:?}", rest)
};
}
};
}
/// Testing tool that compares produced errors.
///
/// This macro takes several arguments:
///
/// * `parser` - name of the data structure implementing `Parser`
/// * `input` - input to be tested against
/// * `rule` - `Rule` which will be run
/// * `positives` - positive `Rule` attempts that failed
/// * `negative` - negative `Rule` attempts that failed
/// * `pos` - byte position of failure
///
/// # Examples
///
/// ```
/// # #[macro_use]
/// # extern crate pest;
/// # use pest::Parser;
/// # use pest::error::Error;
/// # use pest::iterators::Pairs;
/// # fn main() {
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// # enum Rule {
/// # a,
/// # b,
/// # c
/// # }
/// #
/// # struct AbcParser;
/// #
/// # impl Parser<Rule> for AbcParser {
/// # fn parse<'i>(_: Rule, input: &'i str) -> Result<Pairs<'i, Rule>, Error<Rule>> {
/// # pest::state(input, |state| {
/// # state.rule(Rule::a, |state| {
/// # state.skip(1).unwrap().rule(Rule::b, |s| {
/// # s.skip(1)
/// # }).unwrap().skip(1)
/// # }).and_then(|state| {
/// # state.skip(1).unwrap().rule(Rule::c, |s| {
/// # s.match_string("e")
/// # })
/// # })
/// # })
/// # }
/// # }
/// fails_with! {
/// parser: AbcParser,
/// input: "abcdf",
/// rule: Rule::a,
/// positives: vec![Rule::c],
/// negatives: vec![],
/// pos: 4
/// };
/// # }
/// ```
#[macro_export]
macro_rules! fails_with {
( parser: $parser:ident, input: $string:expr, rule: $rules:tt :: $rule:tt,
positives: $positives:expr, negatives: $negatives:expr, pos: $pos:expr ) => {
#[allow(unused_mut)]
{
use $crate::Parser;
let error = $parser::parse($rules::$rule, $string).unwrap_err();
match error.variant {
$crate::error::ErrorVariant::ParsingError {
positives,
negatives,
} => {
assert_eq!(positives, $positives);
assert_eq!(negatives, $negatives);
}
_ => unreachable!(),
};
match error.location {
$crate::error::InputLocation::Pos(pos) => assert_eq!(pos, $pos),
_ => unreachable!(),
}
}
};
}
#[cfg(test)]
pub mod tests {
use super::super::error::Error;
use super::super::iterators::Pairs;
use super::super::{state, Parser};
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub enum Rule {
a,
b,
c,
}
pub struct AbcParser;
impl Parser<Rule> for AbcParser {
fn parse<'i>(_: Rule, input: &'i str) -> Result<Pairs<'i, Rule>, Error<Rule>> {
state(input, |state| {
state
.rule(Rule::a, |s| {
s.skip(1)
.unwrap()
.rule(Rule::b, |s| s.skip(1))
.unwrap()
.skip(1)
})
.and_then(|s| s.skip(1).unwrap().rule(Rule::c, |s| s.match_string("e")))
})
}
}
#[test]
fn parses_to() {
parses_to! {
parser: AbcParser,
input: "abcde",
rule: Rule::a,
tokens: [
a(0, 3, [
b(1, 2),
]),
c(4, 5)
]
};
}
#[test]
#[should_panic]
fn missing_end() {
parses_to! {
parser: AbcParser,
input: "abcde",
rule: Rule::a,
tokens: [
a(0, 3, [
b(1, 2)
])
]
};
}
#[test]
#[should_panic]
fn empty() {
parses_to! {
parser: AbcParser,
input: "abcde",
rule: Rule::a,
tokens: []
};
}
#[test]
fn fails_with() {
fails_with! {
parser: AbcParser,
input: "abcdf",
rule: Rule::a,
positives: vec![Rule::c],
negatives: vec![],
pos: 4
};
}
#[test]
#[should_panic]
fn wrong_positives() {
fails_with! {
parser: AbcParser,
input: "abcdf",
rule: Rule::a,
positives: vec![Rule::a],
negatives: vec![],
pos: 4
};
}
#[test]
#[should_panic]
fn wrong_negatives() {
fails_with! {
parser: AbcParser,
input: "abcdf",
rule: Rule::a,
positives: vec![Rule::c],
negatives: vec![Rule::c],
pos: 4
};
}
#[test]
#[should_panic]
fn wrong_pos() {
fails_with! {
parser: AbcParser,
input: "abcdf",
rule: Rule::a,
positives: vec![Rule::c],
negatives: vec![],
pos: 3
};
}
}
pest-2.1.2/src/parser.rs
// pest. The Elegant Parser
// Copyright (c) 2018 Dragoș Tiselice
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT
// license <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. All files in the project carrying such notice may not be copied,
// modified, or distributed except according to those terms.
use error::Error;
use iterators::Pairs;
use RuleType;
/// A trait with a single method that parses strings.
pub trait Parser<R: RuleType> {
/// Parses a `&str` starting from `rule`.
fn parse<'i>(rule: R, input: &'i str) -> Result<Pairs<'i, R>, Error<R>>;
}
pest-2.1.2/src/parser_state.rs
// pest. The Elegant Parser
// Copyright (c) 2018 Dragoș Tiselice
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT
// license <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. All files in the project carrying such notice may not be copied,
// modified, or distributed except according to those terms.
use std::ops::Range;
use std::rc::Rc;
use error::{Error, ErrorVariant};
use iterators::{pairs, QueueableToken};
use position::{self, Position};
use span::Span;
use stack::Stack;
use RuleType;
/// The current lookahead status of a [`ParserState`].
///
/// [`ParserState`]: struct.ParserState.html
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Lookahead {
Positive,
Negative,
None,
}
/// The current atomicity of a [`ParserState`].
///
/// [`ParserState`]: struct.ParserState.html
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Atomicity {
Atomic,
CompoundAtomic,
NonAtomic,
}
/// Type alias to simplify specifying the return value of chained closures.
pub type ParseResult<S> = Result<S, S>;
/// Match direction for the stack. Used in `PEEK[a..b]`/`stack_match_peek_slice`.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum MatchDir {
BottomToTop,
TopToBottom,
}
/// The complete state of a [`Parser`].
///
/// [`Parser`]: trait.Parser.html
#[derive(Debug)]
pub struct ParserState<'i, R: RuleType> {
position: Position<'i>,
queue: Vec<QueueableToken<R>>,
lookahead: Lookahead,
pos_attempts: Vec<R>,
neg_attempts: Vec<R>,
attempt_pos: usize,
atomicity: Atomicity,
stack: Stack>,
}
/// Creates a `ParserState` from a `&str`, supplying it to a closure `f`.
///
/// # Examples
///
/// ```
/// # use pest;
/// let input = "";
/// pest::state::<(), _>(input, |s| Ok(s)).unwrap();
/// ```
pub fn state<'i, R: RuleType, F>(input: &'i str, f: F) -> Result<Pairs<'i, R>, Error<R>>
where
F: FnOnce(Box<ParserState<'i, R>>) -> ParseResult<Box<ParserState<'i, R>>>,
{
let state = ParserState::new(input);
match f(state) {
Ok(state) => {
let len = state.queue.len();
Ok(pairs::new(Rc::new(state.queue), input, 0, len))
}
Err(mut state) => {
state.pos_attempts.sort();
state.pos_attempts.dedup();
state.neg_attempts.sort();
state.neg_attempts.dedup();
Err(Error::new_from_pos(
ErrorVariant::ParsingError {
positives: state.pos_attempts.clone(),
negatives: state.neg_attempts.clone(),
},
// TODO(performance): Guarantee state.attempt_pos is a valid position
position::Position::new(input, state.attempt_pos).unwrap(),
))
}
}
}
impl<'i, R: RuleType> ParserState<'i, R> {
/// Allocates a fresh `ParserState` object to the heap and returns the owned `Box`. This `Box`
/// will be passed from closure to closure based on the needs of the specified `Parser`.
///
/// # Examples
///
/// ```
/// # use pest;
/// let input = "";
/// let state: Box<pest::ParserState<&str>> = pest::ParserState::new(input);
/// ```
#[allow(clippy::new_ret_no_self)]
pub fn new(input: &'i str) -> Box<Self> {
Box::new(ParserState {
position: Position::from_start(input),
queue: vec![],
lookahead: Lookahead::None,
pos_attempts: vec![],
neg_attempts: vec![],
attempt_pos: 0,
atomicity: Atomicity::NonAtomic,
stack: Stack::new(),
})
}
/// Returns a reference to the current `Position` of the `ParserState`.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// ab
/// }
///
/// let input = "ab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let position = state.position();
/// assert_eq!(position.pos(), 0);
/// ```
pub fn position(&self) -> &Position<'i> {
&self.position
}
/// Returns the current atomicity of the `ParserState`.
///
/// # Examples
///
/// ```
/// # use pest;
/// # use pest::Atomicity;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// ab
/// }
///
/// let input = "ab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let atomicity = state.atomicity();
/// assert_eq!(atomicity, Atomicity::NonAtomic);
/// ```
pub fn atomicity(&self) -> Atomicity {
self.atomicity
}
/// Wrapper needed to generate tokens. This will associate the `R` type rule to the closure
/// meant to match the rule.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a
/// }
///
/// let input = "a";
/// let pairs: Vec<_> = pest::state(input, |state| {
/// state.rule(Rule::a, |s| Ok(s))
/// }).unwrap().collect();
///
/// assert_eq!(pairs.len(), 1);
/// ```
#[inline]
pub fn rule<F>(mut self: Box<Self>, rule: R, f: F) -> ParseResult<Box<Self>>
where
F: FnOnce(Box<Self>) -> ParseResult<Box<Self>>,
{
let actual_pos = self.position.pos();
let index = self.queue.len();
let (pos_attempts_index, neg_attempts_index) = if actual_pos == self.attempt_pos {
(self.pos_attempts.len(), self.neg_attempts.len())
} else {
// Attempts have not been cleared yet since the attempt_pos is older.
(0, 0)
};
if self.lookahead == Lookahead::None && self.atomicity != Atomicity::Atomic {
// Pair's position will only be known after running the closure.
self.queue.push(QueueableToken::Start {
end_token_index: 0,
input_pos: actual_pos,
});
}
let attempts = self.attempts_at(actual_pos);
let result = f(self);
match result {
Ok(mut new_state) => {
if new_state.lookahead == Lookahead::Negative {
new_state.track(
rule,
actual_pos,
pos_attempts_index,
neg_attempts_index,
attempts,
);
}
if new_state.lookahead == Lookahead::None
&& new_state.atomicity != Atomicity::Atomic
{
// Storing the pair's index in the first token that was added before the closure was
// run.
let new_index = new_state.queue.len();
match new_state.queue[index] {
QueueableToken::Start {
ref mut end_token_index,
..
} => *end_token_index = new_index,
_ => unreachable!(),
};
let new_pos = new_state.position.pos();
new_state.queue.push(QueueableToken::End {
start_token_index: index,
rule,
input_pos: new_pos,
});
}
Ok(new_state)
}
Err(mut new_state) => {
if new_state.lookahead != Lookahead::Negative {
new_state.track(
rule,
actual_pos,
pos_attempts_index,
neg_attempts_index,
attempts,
);
}
if new_state.lookahead == Lookahead::None
&& new_state.atomicity != Atomicity::Atomic
{
new_state.queue.truncate(index);
}
Err(new_state)
}
}
}
fn attempts_at(&self, pos: usize) -> usize {
if self.attempt_pos == pos {
self.pos_attempts.len() + self.neg_attempts.len()
} else {
0
}
}
fn track(
&mut self,
rule: R,
pos: usize,
pos_attempts_index: usize,
neg_attempts_index: usize,
prev_attempts: usize,
) {
if self.atomicity == Atomicity::Atomic {
return;
}
// If nested rules made no progress, there is no use to report them; it's only useful to
// track the current rule, the exception being when only one attempt has been made during
// the children rules.
let curr_attempts = self.attempts_at(pos);
if curr_attempts > prev_attempts && curr_attempts - prev_attempts == 1 {
return;
}
if pos == self.attempt_pos {
self.pos_attempts.truncate(pos_attempts_index);
self.neg_attempts.truncate(neg_attempts_index);
}
if pos > self.attempt_pos {
self.pos_attempts.clear();
self.neg_attempts.clear();
self.attempt_pos = pos;
}
let attempts = if self.lookahead != Lookahead::Negative {
&mut self.pos_attempts
} else {
&mut self.neg_attempts
};
if pos == self.attempt_pos {
attempts.push(rule);
}
}
/// Starts a sequence of transformations provided by `f` from the `Box<ParserState>`. Returns
/// the same `Result` returned by `f` in the case of an `Ok`, or `Err` with the current
/// `Box<ParserState>` otherwise.
///
/// This method is useful to parse sequences that only match together which usually come in the
/// form of chained `Result`s with
/// [`Result::and_then`](https://doc.rust-lang.org/std/result/enum.Result.html#method.and_then).
///
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a
/// }
///
/// let input = "a";
/// let pairs: Vec<_> = pest::state(input, |state| {
/// state.sequence(|s| {
/// s.rule(Rule::a, |s| Ok(s)).and_then(|s| {
/// s.match_string("b")
/// })
/// }).or_else(|s| {
/// Ok(s)
/// })
/// }).unwrap().collect();
///
/// assert_eq!(pairs.len(), 0);
/// ```
#[inline]
pub fn sequence<F>(self: Box<Self>, f: F) -> ParseResult<Box<Self>>
where
F: FnOnce(Box<Self>) -> ParseResult<Box<Self>>,
{
let token_index = self.queue.len();
let initial_pos = self.position.clone();
let result = f(self);
match result {
Ok(new_state) => Ok(new_state),
Err(mut new_state) => {
// Restore the initial position and truncate the token queue.
new_state.position = initial_pos;
new_state.queue.truncate(token_index);
Err(new_state)
}
}
}
/// Repeatedly applies the transformation provided by `f` from the `Box<ParserState>` until `f`
/// returns an `Err`. The state inside that final `Err` is returned wrapped in `Ok`, so this
/// method itself never fails.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// ab
/// }
///
/// let input = "aab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.repeat(|s| {
/// s.match_string("a")
/// });
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 2);
///
/// state = pest::ParserState::new(input);
/// result = state.repeat(|s| {
/// s.match_string("b")
/// });
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 0);
/// ```
#[inline]
pub fn repeat<F>(self: Box<Self>, mut f: F) -> ParseResult<Box<Self>>
where
F: FnMut(Box<Self>) -> ParseResult<Box<Self>>,
{
let mut result = f(self);
loop {
match result {
Ok(state) => result = f(state),
Err(state) => return Ok(state),
};
}
}
/// Optionally applies the transformation provided by `f` from the `Box<ParserState>`. Returns
/// `Ok` with the updated `Box<ParserState>` returned by `f` regardless of the `Result`.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// ab
/// }
///
/// let input = "ab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let result = state.optional(|s| {
/// s.match_string("ab")
/// });
/// assert!(result.is_ok());
///
/// state = pest::ParserState::new(input);
/// let result = state.optional(|s| {
/// s.match_string("ac")
/// });
/// assert!(result.is_ok());
/// ```
#[inline]
pub fn optional<F>(self: Box<Self>, f: F) -> ParseResult<Box<Self>>
where
F: FnOnce(Box<Self>) -> ParseResult<Box<Self>>,
{
match f(self) {
Ok(state) | Err(state) => Ok(state),
}
}
/// Attempts to match a single character based on a filter function. Returns `Ok` with the
/// updated `Box<ParserState>` if successful, or `Err` with the updated `Box<ParserState>`
/// otherwise.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "ab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let result = state.match_char_by(|c| c.is_ascii());
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 1);
///
/// let input = "❤";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let result = state.match_char_by(|c| c.is_ascii());
/// assert!(result.is_err());
/// assert_eq!(result.unwrap_err().position().pos(), 0);
/// ```
#[inline]
pub fn match_char_by<F>(mut self: Box<Self>, f: F) -> ParseResult<Box<Self>>
where
F: FnOnce(char) -> bool,
{
if self.position.match_char_by(f) {
Ok(self)
} else {
Err(self)
}
}
/// Attempts to match the given string. Returns `Ok` with the updated `Box<ParserState>` if
/// successful, or `Err` with the updated `Box<ParserState>` otherwise.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "ab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.match_string("ab");
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 2);
///
/// state = pest::ParserState::new(input);
/// result = state.match_string("ac");
/// assert!(result.is_err());
/// assert_eq!(result.unwrap_err().position().pos(), 0);
/// ```
#[inline]
pub fn match_string(mut self: Box<Self>, string: &str) -> ParseResult<Box<Self>> {
if self.position.match_string(string) {
Ok(self)
} else {
Err(self)
}
}
/// Attempts to case-insensitively match the given string. Returns `Ok` with the updated
/// `Box<ParserState>` if successful, or `Err` with the updated `Box<ParserState>` otherwise.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "ab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.match_insensitive("AB");
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 2);
///
/// state = pest::ParserState::new(input);
/// result = state.match_insensitive("AC");
/// assert!(result.is_err());
/// assert_eq!(result.unwrap_err().position().pos(), 0);
/// ```
#[inline]
pub fn match_insensitive(mut self: Box<Self>, string: &str) -> ParseResult<Box<Self>> {
if self.position.match_insensitive(string) {
Ok(self)
} else {
Err(self)
}
}
/// Attempts to match a single character from the given range. Returns `Ok` with the updated
/// `Box<ParserState>` if successful, or `Err` with the updated `Box<ParserState>` otherwise.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "ab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.match_range('a'..'z');
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 1);
///
/// state = pest::ParserState::new(input);
/// result = state.match_range('A'..'Z');
/// assert!(result.is_err());
/// assert_eq!(result.unwrap_err().position().pos(), 0);
/// ```
#[inline]
pub fn match_range(mut self: Box<Self>, range: Range<char>) -> ParseResult<Box<Self>> {
if self.position.match_range(range) {
Ok(self)
} else {
Err(self)
}
}
/// Attempts to skip `n` characters forward. Returns `Ok` with the updated `Box<ParserState>`
/// if successful, or `Err` with the updated `Box<ParserState>` otherwise.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "ab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.skip(1);
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 1);
///
/// state = pest::ParserState::new(input);
/// result = state.skip(3);
/// assert!(result.is_err());
/// assert_eq!(result.unwrap_err().position().pos(), 0);
/// ```
#[inline]
pub fn skip(mut self: Box<Self>, n: usize) -> ParseResult<Box<Self>> {
if self.position.skip(n) {
Ok(self)
} else {
Err(self)
}
}
/// Attempts to skip forward until one of the given strings is found. Returns `Ok` with the
/// updated `Box<ParserState>` whether or not one of the strings is found.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "abcd";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.skip_until(&["c", "d"]);
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 2);
/// ```
#[inline]
pub fn skip_until(mut self: Box<Self>, strings: &[&str]) -> ParseResult<Box<Self>> {
self.position.skip_until(strings);
Ok(self)
}
/// Attempts to match the start of the input. Returns `Ok` with the current `Box<ParserState>`
/// if the parser has not yet advanced, or `Err` with the current `Box<ParserState>` otherwise.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "ab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.start_of_input();
/// assert!(result.is_ok());
///
/// state = pest::ParserState::new(input);
/// state = state.match_string("ab").unwrap();
/// result = state.start_of_input();
/// assert!(result.is_err());
/// ```
#[inline]
pub fn start_of_input(self: Box<Self>) -> ParseResult<Box<Self>> {
if self.position.at_start() {
Ok(self)
} else {
Err(self)
}
}
/// Attempts to match the end of the input. Returns `Ok` with the current `Box<ParserState>` if
/// there is no input remaining, or `Err` with the current `Box<ParserState>` otherwise.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "ab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.end_of_input();
/// assert!(result.is_err());
///
/// state = pest::ParserState::new(input);
/// state = state.match_string("ab").unwrap();
/// result = state.end_of_input();
/// assert!(result.is_ok());
/// ```
#[inline]
pub fn end_of_input(self: Box<Self>) -> ParseResult<Box<Self>> {
if self.position.at_end() {
Ok(self)
} else {
Err(self)
}
}
/// Starts a lookahead transformation provided by `f` from the `Box<ParserState>`. It returns
/// `Ok` with the current `Box<ParserState>` if `f` also returns an `Ok`, or `Err` with the
/// current `Box<ParserState>` otherwise. If `is_positive` is `false`, it swaps the `Ok` and `Err`
/// together, negating the `Result`.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a
/// }
///
/// let input = "a";
/// let pairs: Vec<_> = pest::state(input, |state| {
/// state.lookahead(true, |state| {
/// state.rule(Rule::a, |s| Ok(s))
/// })
/// }).unwrap().collect();
///
/// assert_eq!(pairs.len(), 0);
/// ```
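///
/// A negative lookahead can be sketched the same way (an illustrative example, not taken
/// from the original docs): with `is_positive` set to `false`, a successful inner match
/// makes the overall result an `Err`:
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
///     a
/// }
///
/// let input = "a";
/// let result = pest::state(input, |state| {
///     state.lookahead(false, |state| {
///         state.rule(Rule::a, |s| s.match_string("a"))
///     })
/// });
/// assert!(result.is_err());
/// ```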
#[inline]
pub fn lookahead<F>(mut self: Box<Self>, is_positive: bool, f: F) -> ParseResult<Box<Self>>
where
F: FnOnce(Box<Self>) -> ParseResult<Box<Self>>,
{
let initial_lookahead = self.lookahead;
self.lookahead = if is_positive {
match initial_lookahead {
Lookahead::None | Lookahead::Positive => Lookahead::Positive,
Lookahead::Negative => Lookahead::Negative,
}
} else {
match initial_lookahead {
Lookahead::None | Lookahead::Positive => Lookahead::Negative,
Lookahead::Negative => Lookahead::Positive,
}
};
let initial_pos = self.position.clone();
let result = f(self.checkpoint());
let result_state = match result {
Ok(mut new_state) => {
new_state.position = initial_pos;
new_state.lookahead = initial_lookahead;
Ok(new_state.restore())
}
Err(mut new_state) => {
new_state.position = initial_pos;
new_state.lookahead = initial_lookahead;
Err(new_state.restore())
}
};
if is_positive {
result_state
} else {
match result_state {
Ok(state) => Err(state),
Err(state) => Ok(state),
}
}
}
/// Transformation which stops `Token`s from being generated according to `atomicity`.
///
/// # Examples
///
/// ```
/// # use pest::{self, Atomicity};
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
/// a
/// }
///
/// let input = "a";
/// let pairs: Vec<_> = pest::state(input, |state| {
/// state.atomic(Atomicity::Atomic, |s| {
/// s.rule(Rule::a, |s| Ok(s))
/// })
/// }).unwrap().collect();
///
/// assert_eq!(pairs.len(), 0);
/// ```
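///
/// By contrast (an illustrative sketch, not part of the original docs), running a rule that
/// consumes input under `Atomicity::NonAtomic` lets the rule generate its token pair:
///
/// ```
/// # use pest::{self, Atomicity};
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {
///     a
/// }
///
/// let input = "a";
/// let pairs: Vec<_> = pest::state(input, |state| {
///     state.atomic(Atomicity::NonAtomic, |s| {
///         s.rule(Rule::a, |s| s.match_string("a"))
///     })
/// }).unwrap().collect();
///
/// assert_eq!(pairs.len(), 1);
/// ```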
#[inline]
pub fn atomic<F>(mut self: Box<Self>, atomicity: Atomicity, f: F) -> ParseResult<Box<Self>>
where
F: FnOnce(Box<Self>) -> ParseResult<Box<Self>>,
{
let initial_atomicity = self.atomicity;
let should_toggle = self.atomicity != atomicity;
if should_toggle {
self.atomicity = atomicity;
}
let result = f(self);
match result {
Ok(mut new_state) => {
if should_toggle {
new_state.atomicity = initial_atomicity;
}
Ok(new_state)
}
Err(mut new_state) => {
if should_toggle {
new_state.atomicity = initial_atomicity;
}
Err(new_state)
}
}
}
/// Evaluates the result of closure `f` and pushes the span of the input consumed between the
/// positions before and after `f` is called onto the stack. Returns `Ok(Box<ParserState>)` if
/// `f` succeeds, or `Err(Box<ParserState>)` otherwise.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "ab";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.stack_push(|state| state.match_string("a"));
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 1);
/// ```
#[inline]
pub fn stack_push<F>(self: Box<Self>, f: F) -> ParseResult<Box<Self>>
where
F: FnOnce(Box<Self>) -> ParseResult<Box<Self>>,
{
let start = self.position.clone();
let result = f(self);
match result {
Ok(mut state) => {
let end = state.position.clone();
state.stack.push(start.span(&end));
Ok(state)
}
Err(state) => Err(state),
}
}
/// Peeks the top of the stack and attempts to match the string. Returns `Ok(Box<ParserState>)`
/// if the string is matched successfully, or `Err(Box<ParserState>)` otherwise.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "aa";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.stack_push(|state| state.match_string("a")).and_then(
/// |state| state.stack_peek()
/// );
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 2);
/// ```
#[inline]
pub fn stack_peek(self: Box<Self>) -> ParseResult<Box<Self>> {
let string = self
.stack
.peek()
.expect("peek was called on empty stack")
.as_str();
self.match_string(string)
}
/// Pops the top of the stack and attempts to match the string. Returns `Ok(Box<ParserState>)`
/// if the string is matched successfully, or `Err(Box<ParserState>)` otherwise.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "aa";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.stack_push(|state| state.match_string("a")).and_then(
/// |state| state.stack_pop()
/// );
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 2);
/// ```
#[inline]
pub fn stack_pop(mut self: Box<Self>) -> ParseResult<Box<Self>> {
let string = self
.stack
.pop()
.expect("pop was called on empty stack")
.as_str();
self.match_string(string)
}
/// Matches part of the state of the stack.
///
/// # Examples
///
/// ```
/// # use pest::{self, MatchDir};
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "abcd cd cb";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state
/// .stack_push(|state| state.match_string("a"))
/// .and_then(|state| state.stack_push(|state| state.match_string("b")))
/// .and_then(|state| state.stack_push(|state| state.match_string("c")))
/// .and_then(|state| state.stack_push(|state| state.match_string("d")))
/// .and_then(|state| state.match_string(" "))
/// .and_then(|state| state.stack_match_peek_slice(2, None, MatchDir::BottomToTop))
/// .and_then(|state| state.match_string(" "))
/// .and_then(|state| state.stack_match_peek_slice(1, Some(-1), MatchDir::TopToBottom));
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 10);
/// ```
#[inline]
pub fn stack_match_peek_slice(
mut self: Box<Self>,
start: i32,
end: Option<i32>,
match_dir: MatchDir,
) -> ParseResult<Box<Self>> {
let range = match constrain_idxs(start, end, self.stack.len()) {
Some(r) => r,
None => return Err(self),
};
// matching an empty sequence always succeeds
if range.end <= range.start {
return Ok(self);
}
let mut position = self.position.clone();
let result = {
let mut iter_b2t = self.stack[range].iter();
let matcher = |span: &Span| position.match_string(span.as_str());
match match_dir {
MatchDir::BottomToTop => iter_b2t.all(matcher),
MatchDir::TopToBottom => iter_b2t.rev().all(matcher),
}
};
if result {
self.position = position;
Ok(self)
} else {
Err(self)
}
}
/// Matches the full state of the stack.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "abba";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state
/// .stack_push(|state| state.match_string("a"))
/// .and_then(|state| { state.stack_push(|state| state.match_string("b")) })
/// .and_then(|state| state.stack_match_peek());
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 4);
/// ```
#[inline]
pub fn stack_match_peek(self: Box<Self>) -> ParseResult<Box<Self>> {
self.stack_match_peek_slice(0, None, MatchDir::TopToBottom)
}
/// Matches the full state of the stack. This method will clear the stack as it evaluates.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "aaaa";
/// let mut state: Box<pest::ParserState<Rule>> = pest::ParserState::new(input);
/// let mut result = state.stack_push(|state| state.match_string("a")).and_then(|state| {
/// state.stack_push(|state| state.match_string("a"))
/// }).and_then(|state| state.stack_match_pop());
/// assert!(result.is_ok());
/// assert_eq!(result.unwrap().position().pos(), 4);
/// ```
#[inline]
pub fn stack_match_pop(mut self: Box<Self>) -> ParseResult<Box<Self>> {
let mut position = self.position.clone();
let mut result = true;
while let Some(span) = self.stack.pop() {
result = position.match_string(span.as_str());
if !result {
break;
}
}
if result {
self.position = position;
Ok(self)
} else {
Err(self)
}
}
/// Drops the top of the stack. Returns `Ok(Box<ParserState>)` if there was a value to drop, or
/// `Err(Box<ParserState>)` otherwise.
///
/// # Examples
///
/// ```
/// # use pest;
/// # #[allow(non_camel_case_types)]
/// # #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
/// enum Rule {}
///
/// let input = "aa";
/// let mut state: Box