pax_global_header00006660000000000000000000000064144711617510014521gustar00rootroot0000000000000052 comment=6dde6d4b900d0982b6c66b59d35c0a44f788ea34 nom_locate-4.2.0/000077500000000000000000000000001447116175100136445ustar00rootroot00000000000000nom_locate-4.2.0/.github/000077500000000000000000000000001447116175100152045ustar00rootroot00000000000000nom_locate-4.2.0/.github/workflows/000077500000000000000000000000001447116175100172415ustar00rootroot00000000000000nom_locate-4.2.0/.github/workflows/ci.yml000066400000000000000000000027431447116175100203650ustar00rootroot00000000000000name: CI on: [push, pull_request] env: RUST_MINVERSION: 1.48.0 jobs: test: name: Test runs-on: ubuntu-latest strategy: matrix: rust: - stable - beta - nightly - 1.48.0 features: - '' - '--features alloc --no-default-features' - '--no-default-features' steps: - name: Checkout sources uses: actions/checkout@v2 - name: Install rust (${{ matrix.rust }}) uses: actions-rs/toolchain@v1 with: toolchain: ${{ matrix.rust }} profile: minimal override: true - name: Build uses: actions-rs/cargo@v1 with: command: build args: --verbose ${{ matrix.features }} - name: Test uses: actions-rs/cargo@v1 with: command: test args: --verbose ${{ matrix.features }} - name: Doc uses: actions-rs/cargo@v1 with: command: doc args: --verbose ${{ matrix.features }} fmt: name: Check formatting runs-on: ubuntu-latest steps: - name: Checkout sources uses: actions/checkout@v2 - name: Install rust uses: actions-rs/toolchain@v1 with: toolchain: stable components: rustfmt profile: minimal override: true - name: cargo fmt -- --check uses: actions-rs/cargo@v1 with: command: fmt args: -- --check nom_locate-4.2.0/.gitignore000066400000000000000000000000361447116175100156330ustar00rootroot00000000000000target/ **/*.rs.bk Cargo.lock nom_locate-4.2.0/CHANGELOG.md000066400000000000000000000115521447116175100154610ustar00rootroot00000000000000# CHANGELOG ## v4.2.0 Improvements: * [Add methods to take ownership of fragment and extra 
data](https://github.com/fflorent/nom_locate/pull/91) Internal: * Remove build status from README * Fix compilation warning in example ## v4.1.0 Improvements: * [Remove unneeded bounds & add map method](https://github.com/fflorent/nom_locate/pull/83) * [Implement AsRef for LocatedSpan](https://github.com/fflorent/nom_locate/pull/85) Documentation fix: * [Use `new_extra` instead of `new` for `LocatedSpan` with extra data](https://github.com/fflorent/nom_locate/pull/84) ## v4.0.0 Breaking change: * [Update to nom 7](https://github.com/fflorent/nom_locate/pull/78) ## v3.1.0 This is likely the last 3.x.x release, and 4.0.0 will use nom 7 instead of nom 6. Improvements: * [Genericizes the rest of the nom traits](https://github.com/fflorent/nom_locate/pull/76) Documentation fix: * [Fix link to docs of LocatedSpan in README](https://github.com/fflorent/nom_locate/pull/77) ## v3.0.2 Fixes: * [Generalize FindSubstring impl](https://github.com/fflorent/nom_locate/pull/72) to types other than &'a str * [no_std support](https://github.com/fflorent/nom_locate/pull/61) Other changes: * Switched CI from Travis to Github Actions * Always run 'cargo fmt' on the CI ## v3.0.1 Fix: * [Skip test it_should_ignore_extra_for_hash on no_std](https://github.com/fflorent/nom_locate/commit/42046bc1765d45dac00e2d6dd3bbd07b997946f1) Documentation fixes/updates: * [README.md: Update example code block from the documentation](https://github.com/fflorent/nom_locate/commit/5775fe3c5203ca082e8e61049eac78195e3c2386) * [Fix erroneous backticks in documentation + Update documentation from README and nom](https://github.com/fflorent/nom_locate/pull/63) ## v3.0.0 Breaking change: * [Update to nom 6](https://github.com/fflorent/nom_locate/pull/67) Other change: * [Implement Hash if members impl Hash](https://github.com/fflorent/nom_locate/pull/69) ## v2.1.0 This release mostly brings some new trait implementations for convenience. 
* [Change tests text for copyright reasons](https://github.com/fflorent/nom_locate/pull/56)
* [Implement `From` for `LocatedSpan`](https://github.com/fflorent/nom_locate/pull/57)
* [Implement `Deref` for `LocatedSpan`, returning the fragment](https://github.com/fflorent/nom_locate/pull/58)
* [Optionally implement `StableDeref` as well](https://github.com/fflorent/nom_locate/pull/65), if the `stable-deref-trait` feature is enabled.
* [Generalize `Compare`](https://github.com/fflorent/nom_locate/pull/58)
* [Generalize `HexDisplay`, and deprecate the `impl_hex_display!` macro, which no longer does anything](https://github.com/fflorent/nom_locate/pull/58)
* [Add `LocatedSpan::get_line_beginning`](https://github.com/fflorent/nom_locate/pull/66), which returns the beginning of a line up to the end of the LocatedSpan. Useful to display human-friendly errors.

## v2.0.0

This release brings several breaking changes:

* [Error type for "position" is made generic](https://github.com/fflorent/nom_locate/pull/37)
* [`extra` property is now ignored when comparing LocatedSpan](https://github.com/fflorent/nom_locate/pull/46)
* [Dependency on nom now uses `default-features = false`](https://github.com/fflorent/nom_locate/pull/47)
* [`offset`/`line`/`fragment` are now private attributes of the `LocatedSpan` structure](https://github.com/fflorent/nom_locate/pull/50), to fix undefined behavior if they are modified. You now have to use the `location_offset()`, `location_line()`, and `fragment()` getters instead.
* [`LocatedSpanEx` is removed in favour of adding a generic type parameter to `LocatedSpan` which defaults to `()`](https://github.com/fflorent/nom_locate/pull/51)

Additionally, there are a few documentation improvements:

* LocatedSpan should not be constructed in the middle of a parser.
* Fix typo in extra property docs for LocatedSpan

Finally, [`LocatedSpan` now implements `Display`](https://github.com/fflorent/nom_locate/pull/40)

## v1.0.0

We decided that the crate was mature enough to release version 1.0.0. It doesn't bring many new things, but we are still proud of this big move! :tada:

- [Implement AsBytes](https://github.com/fflorent/nom_locate/pull/33)

## v0.4.0

- [Support for Nom v5](https://github.com/fflorent/nom_locate/pull/23)
- [Add support for extra information to LocatedSpan](https://github.com/fflorent/nom_locate/pull/28)

Thanks to the people who made this release: @ProgVal, @peckpeck, @wycats, @dalance

## v0.3.1

Patch version:
- [Support no_std](https://github.com/fflorent/nom_locate/pull/16)
- [Fix compilation with verbose-errors](https://github.com/fflorent/nom_locate/issues/17)

## v0.3

- [Support for Nom v4](https://github.com/fflorent/nom_locate/pull/10)
- [Better performance for columns calculation](https://github.com/fflorent/nom_locate/issues/4)
- [Speed up slices](https://github.com/fflorent/nom_locate/pull/15)
nom_locate-4.2.0/Cargo.toml000066400000000000000000000020371447116175100155760ustar00rootroot00000000000000
[package]
authors = ["Florent FAYOLLE ", "Christopher Durham ", "Valentin Lorentz "]
categories = ["parsing"]
description = "A special input type for nom to locate tokens"
documentation = "https://docs.rs/nom_locate/"
homepage = "https://github.com/fflorent/nom_locate"
keywords = ["nom"]
license = "MIT"
name = "nom_locate"
readme = "README.md"
repository = "https://github.com/fflorent/nom_locate"
version = "4.2.0"
edition = "2018"

[badges.travis-ci]
repository = "fflorent/nom_locate"

[features]
default = ["std"]
std = ["nom/std", "alloc", "memchr/use_std"]
alloc = ["nom/alloc"]
generic-simd = ["bytecount/generic-simd"]
runtime-dispatch-simd = ["bytecount/runtime-dispatch-simd"]
stable-deref-trait = ["stable_deref_trait"]

[dependencies]
bytecount = "^0.6"
memchr = { version = ">=1.0.1, <3.0.0", default-features = 
false } # ^1.0.0 + ^2.0
nom = { version = "7", default-features = false }
stable_deref_trait = { version = "^1", optional = true, default-features = false }
nom_locate-4.2.0/FAQ.md000066400000000000000000000015701447116175100146000ustar00rootroot00000000000000
# FAQ

## How to use LocatedSpan with my own input type?

LocatedSpan has been designed to wrap any input type. By default it wraps `&str` and `&[u8]`, but it should work with any other type.

To do so, all you need is to ensure that your input type implements these traits:
 - `nom::InputLength`
 - `nom::Slice`
 - `nom::InputIter`
 - `nom::Compare`
 - `nom::Offset`
 - `nom::CompareResult`
 - `nom::FindSubstring`
 - `nom::ParseTo`
 - `nom::AsBytes`

Also ensure that whatever represents a char in your input type implements `nom::FindToken`.

Then you may use all the `impl_*` macros exposed by the library (see the [crate documentation](https://docs.rs/nom_locate/)).

## `get_column` is not accurate

Your input probably contains non-ASCII characters. In that case, prefer `get_utf8_column`, which counts UTF-8 characters rather than bytes, keeping in mind that it is much slower.
nom_locate-4.2.0/LICENSE000066400000000000000000000020661447116175100146550ustar00rootroot00000000000000
Copyright 2017-2019 Florent Fayolle, Valentin Lorentz

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
nom_locate-4.2.0/README.md000066400000000000000000000062431447116175100151300ustar00rootroot00000000000000
# nom_locate [![Coverage Status](https://coveralls.io/repos/github/fflorent/nom_locate/badge.svg?branch=master)](https://coveralls.io/github/fflorent/nom_locate?branch=master) [![](https://img.shields.io/crates/v/nom_locate.svg)](https://crates.io/crates/nom_locate)

A special input type for [nom](https://github.com/geal/nom) to locate tokens

## Documentation

The documentation of the crate is available [here](https://docs.rs/nom_locate/).

## How to use it

The crate provides the [`LocatedSpan` struct](https://docs.rs/nom_locate/latest/nom_locate/struct.LocatedSpan.html) that encapsulates the data.
Look at the example below and the explanations that follow:

````rust
use nom::bytes::complete::{tag, take_until};
use nom::IResult;
use nom_locate::{position, LocatedSpan};

type Span<'a> = LocatedSpan<&'a str>;

struct Token<'a> {
    pub position: Span<'a>,
    pub foo: String,
    pub bar: String,
}

fn parse_foobar(s: Span) -> IResult<Span, Token> {
    // Skip everything before "foo", then record where the token starts.
    let (s, _) = take_until("foo")(s)?;
    let (s, pos) = position(s)?;
    let (s, foo) = tag("foo")(s)?;
    let (s, bar) = tag("bar")(s)?;

    Ok((
        s,
        Token {
            position: pos,
            foo: foo.fragment().to_string(),
            bar: bar.fragment().to_string(),
        },
    ))
}

fn main () {
    let input = Span::new("Lorem ipsum \n foobar");
    let output = parse_foobar(input);
    let position = output.unwrap().1.position;
    assert_eq!(position.location_offset(), 14);
    assert_eq!(position.location_line(), 2);
    assert_eq!(position.fragment(), &"");
    assert_eq!(position.get_column(), 2);
}
````

### Import

Import [nom](https://github.com/geal/nom) and nom_locate.

````rust
extern crate nom;
extern crate nom_locate;

use nom::bytes::complete::{tag, take_until};
use nom::IResult;
use nom_locate::{position, LocatedSpan};
````

You will probably also want to create a [type alias](https://doc.rust-lang.org/book/type-aliases.html) for convenience, so you don't have to specify the `fragment` type every time:

````rust
type Span<'a> = LocatedSpan<&'a str>;
````

### Define the output structure

The output structure of your parser may contain the position as a `Span` (which provides the `offset`, `line` and `column` information to locate your token).

````rust
struct Token<'a> {
    pub position: Span<'a>,
    pub foo: &'a str,
    pub bar: &'a str,
}
````

### Create the parser

The parser has to accept a `Span` as an input.
You may use `position()` in your nom parser, in order to capture the location of your token:

````rust
fn parse_foobar(s: Span) -> IResult<Span, Token> {
    let (s, _) = take_until("foo")(s)?;
    let (s, pos) = position(s)?;
    let (s, foo) = tag("foo")(s)?;
    let (s, bar) = tag("bar")(s)?;

    Ok((
        s,
        Token {
            position: pos,
            // The span fields are private; use the `fragment()` getter.
            foo: foo.fragment(),
            bar: bar.fragment(),
        },
    ))
}
````

### Call the parser

The parser returns a `nom::IResult` (hence the `unwrap().1`). The `position` property contains the `offset`, `line` and `column`.

````rust
fn main () {
    let input = Span::new("Lorem ipsum \n foobar");
    let output = parse_foobar(input);
    let position = output.unwrap().1.position;
    assert_eq!(position.location_offset(), 14);
    assert_eq!(position.location_line(), 2);
    assert_eq!(position.fragment(), &"");
    assert_eq!(position.get_column(), 2);
}
````
nom_locate-4.2.0/benches/000077500000000000000000000000001447116175100152535ustar00rootroot00000000000000
nom_locate-4.2.0/benches/benches.rs000066400000000000000000000120671447116175100172360ustar00rootroot00000000000000
#![feature(test)]

extern crate test;

use nom::Slice;
use nom_locate::LocatedSpan;

use test::Bencher;

// Pan Tadeusz. https://pl.m.wikisource.org/wiki/Pan_Tadeusz_(wyd._1834)/Ksi%C4%99ga_pierwsza
const TEXT: &str = "Litwo! Ojczyzno moja! ty jesteś jak zdrowie; Ile cię trzeba cenić, ten tylko się dowie Kto cię stracił. Dziś piękność twą w całéj ozdobie Widzę i opisuję, bo tęsknię po tobie. Panno święta, co jasnéj bronisz Częstochowy I w Ostréj świecisz Bramie! Ty, co gród zamkowy Nowogródzki ochraniasz z jego wiernym ludem! Jak mnie dziecko do zdrowia powróciłaś cudem, (Gdy od płaczącéj matki, pod Twoję opiekę Ofiarowany, martwą podniosłem powiekę; I zaraz mogłem pieszo, do Twych świątyń progu Iść za wrócone życie podziękować Bogu;) Tak nas powrócisz cudem na Ojczyzny łono. 
Tymczasem przenoś moję duszę utęsknioną Do tych pagórków leśnych, do tych łąk zielonych, Szeroko nad błękitnym Niemnem rosciągnionych; Do tych pól malowanych zbożem rozmaitém, Wyzłacanych pszenicą, posrebrzanych żytem; Gdzie bursztynowy świerzop, gryka jak śnieg biała, Gdzie panieńskim rumieńcem dzięcielina pała, A wszystko przepasane jakby wstęgą, miedzą Zieloną, na niéj zrzadka ciche grusze siedzą. Sród takich pól przed laty, nad brzegiem ruczaju, Na pagórku niewielkim, we brzozowym gaju, Stał dwór szlachecki, z drzewa, lecz podmurowany; Świéciły się zdaleka pobielane ściany, Tém bielsze że odbite od ciemnéj zieleni Topoli, co go bronią od wiatrów jesieni. Dóm mieszkalny niewielki lecz zewsząd chędogi, I stodołę miał wielką i przy niéj trzy stogi Użątku, co pod strzechą zmieścić się niemoże; Widać że okolica obfita we zboże, I widać z liczby kopic, co wzdłuż i wszerz smugów Świecą gęsto jak gwiazdy; widać z liczby pługów Orzących wcześnie łany ogromne ugoru Czarnoziemne, zapewne należne do dworu, Uprawne dobrze nakształt ogrodowych grządek: Że w tym domu dostatek mieszka i porządek. Brama na wciąż otwarta przechodniom ogłasza, Że gościnna, i wszystkich w gościnę zaprasza."; const TEXT_ASCII: &str = "Litwo! Ojczyzno moja! ty jestes jak zdrowie; Ile cie trzeba cenic, ten tylko sie dowie Kto cie stracil. Dzis pieknosc twa w calej ozdobie Widze i opisuje, bo tesknie po tobie. Panno swieta, co jasnej bronisz Czestochowy I w Ostrej swiecisz Bramie![1] Ty, co grod zamkowy Nowogrodzki ochraniasz z jego wiernym ludem! Jak mnie dziecko do zdrowia powrocilas cudem, (Gdy od placzacej matki, pod Twoje opieke Ofiarowany, martwa podnioslem powieke; I zaraz moglem pieszo, do Twych swiatyn progu Isc za wrocone zycie podziekowac Bogu;) Tak nas powrocisz cudem na Ojczyzny lono. 
Tymczasem przenos moje dusze uteskniona Do tych pagorkow lesnych, do tych lak zielonych, Szeroko nad blekitnym Niemnem rosciagnionych; Do tych pol malowanych zbozem rozmaitem, Wyzlacanych pszenica, posrebrzanych zytem; Gdzie bursztynowy swierzop, gryka jak snieg biala, Gdzie panienskim rumiencem dziecielina pala, A wszystko przepasane jakby wstega, miedza Zielona, na niej zrzadka ciche grusze siedza. Srod takich pol przed laty, nad brzegiem ruczaju, Na pagorku niewielkim, we brzozowym gaju, Stal dwor szlachecki, z drzewa, lecz podmurowany; Swiecily sie zdaleka pobielane sciany, Tem bielsze ze odbite od ciemnej zieleni Topoli, co go bronia od wiatrow jesieni. Dom mieszkalny niewielki lecz zewszad chedogi, I stodole mial wielka i przy niej trzy stogi Uzatku, co pod strzecha zmiescic sie niemoze; Widac ze okolica obfita we zboze, I widac z liczby kopic, co wzdluz i wszerz smugow Swieca gesto jak gwiazdy; widac z liczby plugow Orzacych wczesnie lany ogromne ugoru Czarnoziemne, zapewne nalezne do dworu, Uprawne dobrze naksztalt ogrodowych grzadek: Ze w tym domu dostatek mieszka i porzadek. 
Brama na wciaz otwarta przechodniom oglasza, Ze goscinna, i wszystkich w goscine zaprasza."; #[bench] fn bench_slice_full(b: &mut Bencher) { let input = LocatedSpan::new(TEXT); b.iter(|| { input.slice(..); }); } #[bench] fn bench_slice_from(b: &mut Bencher) { let input = LocatedSpan::new(TEXT); b.iter(|| { input.slice(200..); }); } #[bench] fn bench_slice_from_zero(b: &mut Bencher) { let input = LocatedSpan::new(TEXT); b.iter(|| { input.slice(0..); }); } #[bench] fn bench_slice_to(b: &mut Bencher) { let input = LocatedSpan::new(TEXT); b.iter(|| { input.slice(..200); }); } #[bench] fn bench_slice(b: &mut Bencher) { let input = LocatedSpan::new(TEXT); b.iter(|| { input.slice(200..300); }); } #[bench] fn bench_slice_columns_only(b: &mut Bencher) { let text = TEXT.replace("\n", ""); let input = LocatedSpan::new(text.as_str()); b.iter(|| { input.slice(499..501).get_utf8_column(); }); } #[bench] fn bench_slice_columns_only_for_ascii_text(b: &mut Bencher) { #[allow(unused)] use std::ascii::AsciiExt; let text = TEXT_ASCII.replace("\n", ""); let input = LocatedSpan::new(text.as_str()); assert!(text.is_ascii()); b.iter(|| { input.slice(500..501).get_column(); }); } nom_locate-4.2.0/examples/000077500000000000000000000000001447116175100154625ustar00rootroot00000000000000nom_locate-4.2.0/examples/position.rs000066400000000000000000000017661447116175100177060ustar00rootroot00000000000000extern crate nom; extern crate nom_locate; use nom::bytes::complete::{tag, take_until}; use nom::IResult; use nom_locate::{position, LocatedSpan}; type Span<'a> = LocatedSpan<&'a str>; struct Token<'a> { pub position: Span<'a>, pub _foo: &'a str, pub _bar: &'a str, } fn parse_foobar(s: Span) -> IResult { let (s, _) = take_until("foo")(s)?; let (s, pos) = position(s)?; let (s, foo) = tag("foo")(s)?; let (s, bar) = tag("bar")(s)?; Ok(( s, Token { position: pos, _foo: foo.fragment(), _bar: bar.fragment(), }, )) } fn main() { let input = Span::new("Lorem ipsum \n foobar"); let output = 
parse_foobar(input);
    let position = output.unwrap().1.position;
    assert_eq!(position, unsafe {
        Span::new_from_raw_offset(
            14, // offset
            2, // line
            "", // fragment
            (), // extra
        )
    });
    assert_eq!(position.get_column(), 2);
}
nom_locate-4.2.0/src/000077500000000000000000000000001447116175100144335ustar00rootroot00000000000000
nom_locate-4.2.0/src/lib.rs000066400000000000000000000621411447116175100155530ustar00rootroot00000000000000
//! nom_locate, a special input type to locate tokens
//!
//! The source code is available on [Github](https://github.com/fflorent/nom_locate)
//!
//! ## Features
//!
//! This crate exposes two cargo feature flags, `generic-simd` and `runtime-dispatch-simd`.
//! These correspond to the features exposed by [bytecount](https://github.com/llogiq/bytecount).
//!
//! ## How to use it
//! The explanations are given in the [README](https://github.com/fflorent/nom_locate/blob/master/README.md) of the Github repository. You may also consult the [FAQ](https://github.com/fflorent/nom_locate/blob/master/FAQ.md).
//!
//! ```
//! use nom::bytes::complete::{tag, take_until};
//! use nom::IResult;
//! use nom_locate::{position, LocatedSpan};
//!
//! type Span<'a> = LocatedSpan<&'a str>;
//!
//! struct Token<'a> {
//!     pub position: Span<'a>,
//!     pub foo: &'a str,
//!     pub bar: &'a str,
//! }
//!
//! fn parse_foobar(s: Span) -> IResult<Span, Token> {
//!     let (s, _) = take_until("foo")(s)?;
//!     let (s, pos) = position(s)?;
//!     let (s, foo) = tag("foo")(s)?;
//!     let (s, bar) = tag("bar")(s)?;
//!
//!     Ok((
//!         s,
//!         Token {
//!             position: pos,
//!             foo: foo.fragment(),
//!             bar: bar.fragment(),
//!         },
//!     ))
//! }
//!
//! fn main () {
//!     let input = Span::new("Lorem ipsum \n foobar");
//!     let output = parse_foobar(input);
//!     let position = output.unwrap().1.position;
//!     assert_eq!(position.location_offset(), 14);
//!     assert_eq!(position.location_line(), 2);
//!     assert_eq!(position.fragment(), &"");
//!     assert_eq!(position.get_column(), 2);
//! }
//! ```
//!
//! 
## Extra information //! //! You can also add arbitrary extra information using the extra property of `LocatedSpan`. //! This property is not used when comparing two `LocatedSpan`s. //! //! ```ignore //! use nom_locate::LocatedSpan; //! type Span<'a> = LocatedSpan<&'a str, String>; //! //! let input = Span::new_extra("Lorem ipsum \n foobar", "filename"); //! let output = parse_foobar(input); //! let extra = output.unwrap().1.extra; //! ``` #![cfg_attr(not(feature = "std"), no_std)] #[cfg(all(not(feature = "std"), feature = "alloc"))] #[cfg_attr(test, macro_use)] extern crate alloc; #[cfg(test)] mod tests; mod lib { #[cfg(feature = "std")] pub mod std { pub use std::fmt::{Display, Formatter, Result as FmtResult}; pub use std::hash::{Hash, Hasher}; pub use std::iter::{Copied, Enumerate}; pub use std::ops::{Range, RangeFrom, RangeFull, RangeTo}; pub use std::slice; pub use std::slice::Iter; pub use std::str::{CharIndices, Chars, FromStr}; pub use std::string::{String, ToString}; pub use std::vec::Vec; } #[cfg(not(feature = "std"))] pub mod std { #[cfg(feature = "alloc")] pub use alloc::fmt::{Display, Formatter, Result as FmtResult}; #[cfg(feature = "alloc")] pub use alloc::string::{String, ToString}; #[cfg(feature = "alloc")] pub use alloc::vec::Vec; pub use core::hash::{Hash, Hasher}; pub use core::iter::{Copied, Enumerate}; pub use core::ops::{Range, RangeFrom, RangeFull, RangeTo}; pub use core::slice; pub use core::slice::Iter; pub use core::str::{CharIndices, Chars, FromStr}; } } use lib::std::*; use bytecount::{naive_num_chars, num_chars}; use memchr::Memchr; #[cfg(feature = "alloc")] use nom::ExtendInto; use nom::{ error::{ErrorKind, ParseError}, AsBytes, Compare, CompareResult, Err, FindSubstring, FindToken, IResult, InputIter, InputLength, InputTake, InputTakeAtPosition, Offset, ParseTo, Slice, }; #[cfg(feature = "stable-deref-trait")] use stable_deref_trait::StableDeref; /// A LocatedSpan is a set of meta information about the location of a token, including 
extra /// information. /// /// The `LocatedSpan` structure can be used as an input of the nom parsers. /// It implements all the necessary traits for `LocatedSpan<&str,X>` and `LocatedSpan<&[u8],X>` #[derive(Debug, Clone, Copy)] pub struct LocatedSpan { /// The offset represents the position of the fragment relatively to /// the input of the parser. It starts at offset 0. offset: usize, /// The line number of the fragment relatively to the input of the /// parser. It starts at line 1. line: u32, /// The fragment that is spanned. /// The fragment represents a part of the input of the parser. fragment: T, /// Extra information that can be embedded by the user. /// Example: the parsed file name pub extra: X, } impl core::ops::Deref for LocatedSpan { type Target = T; fn deref(&self) -> &Self::Target { &self.fragment } } impl core::convert::AsRef for LocatedSpan<&T, X> where T: ?Sized + core::convert::AsRef, U: ?Sized, { fn as_ref(&self) -> &U { self.fragment.as_ref() } } #[cfg(feature = "stable-deref-trait")] /// Optionally impl StableDeref so that this type works harmoniously with other /// crates that rely on this marker trait, such as `rental` and `lazy_static`. /// LocatedSpan is largely just a wrapper around the contained type `T`, so /// this marker trait is safe to implement whenever T already implements /// StableDeref. unsafe impl StableDeref for LocatedSpan {} impl LocatedSpan { /// Create a span for a particular input with default `offset` and /// `line` values and empty extra data. /// You can compute the column through the `get_column` or `get_utf8_column` /// methods. /// /// `offset` starts at 0, `line` starts at 1, and `column` starts at 1. /// /// Do not use this constructor in parser functions; `nom` and /// `nom_locate` assume span offsets are relative to the beginning of the /// same input. In these cases, you probably want to use the /// `nom::traits::Slice` trait instead. 
/// /// # Example of use /// /// ``` /// # extern crate nom_locate; /// use nom_locate::LocatedSpan; /// /// # fn main() { /// let span = LocatedSpan::new(b"foobar"); /// /// assert_eq!(span.location_offset(), 0); /// assert_eq!(span.location_line(), 1); /// assert_eq!(span.get_column(), 1); /// assert_eq!(span.fragment(), &&b"foobar"[..]); /// # } /// ``` pub fn new(program: T) -> LocatedSpan { LocatedSpan { offset: 0, line: 1, fragment: program, extra: (), } } } impl LocatedSpan { /// Create a span for a particular input with default `offset` and /// `line` values. You can compute the column through the `get_column` or `get_utf8_column` /// methods. /// /// `offset` starts at 0, `line` starts at 1, and `column` starts at 1. /// /// Do not use this constructor in parser functions; `nom` and /// `nom_locate` assume span offsets are relative to the beginning of the /// same input. In these cases, you probably want to use the /// `nom::traits::Slice` trait instead. /// /// # Example of use /// /// ``` /// # extern crate nom_locate; /// use nom_locate::LocatedSpan; /// /// # fn main() { /// let span = LocatedSpan::new_extra(b"foobar", "extra"); /// /// assert_eq!(span.location_offset(), 0); /// assert_eq!(span.location_line(), 1); /// assert_eq!(span.get_column(), 1); /// assert_eq!(span.fragment(), &&b"foobar"[..]); /// assert_eq!(span.extra, "extra"); /// # } /// ``` pub fn new_extra(program: T, extra: X) -> LocatedSpan { LocatedSpan { offset: 0, line: 1, fragment: program, extra: extra, } } /// Similar to `new_extra`, but allows overriding offset and line. /// This is unsafe, because giving an offset too large may result in /// undefined behavior, as some methods move back along the fragment /// assuming any negative index within the offset is valid. 
pub unsafe fn new_from_raw_offset( offset: usize, line: u32, fragment: T, extra: X, ) -> LocatedSpan { LocatedSpan { offset, line, fragment, extra, } } /// The offset represents the position of the fragment relatively to /// the input of the parser. It starts at offset 0. pub fn location_offset(&self) -> usize { self.offset } /// The line number of the fragment relatively to the input of the /// parser. It starts at line 1. pub fn location_line(&self) -> u32 { self.line } /// The fragment that is spanned. /// The fragment represents a part of the input of the parser. pub fn fragment(&self) -> &T { &self.fragment } /// Transform the extra inside into another type /// /// # Example of use /// ``` /// # extern crate nom_locate; /// # extern crate nom; /// # use nom_locate::LocatedSpan; /// use nom::{ /// IResult, /// combinator::{recognize, map_res}, /// sequence::{terminated, tuple}, /// character::{complete::{char, one_of}, is_digit}, /// bytes::complete::{tag, take_while1} /// }; /// /// fn decimal(input: LocatedSpan<&str>) -> IResult, LocatedSpan<&str>> { /// recognize( /// take_while1(|c| is_digit(c as u8) || c == '_') /// )(input) /// } /// /// fn main() { /// let span = LocatedSpan::new("$10"); /// // matches the $ and then matches the decimal number afterwards, /// // converting it into a `u8` and putting that value in the span /// let (_, (_, n)) = tuple(( /// tag("$"), /// map_res( /// decimal, /// |x| x.fragment().parse::().map(|n| x.map_extra(|_| n)) /// ) /// ))(span).unwrap(); /// assert_eq!(n.extra, 10); /// } /// ``` pub fn map_extra U>(self, f: F) -> LocatedSpan { LocatedSpan { offset: self.offset, line: self.line, fragment: self.fragment, extra: f(self.extra), } } /// Takes ownership of the fragment without (re)borrowing it. 
/// /// # Example of use /// ``` /// # extern crate nom_locate; /// # extern crate nom; /// # use nom_locate::LocatedSpan; /// use nom::{ /// IResult, /// bytes::complete::{take_till, tag}, /// combinator::rest, /// }; /// /// fn parse_pair<'a>(input: LocatedSpan<&'a str>) -> IResult, (&'a str, &'a str)> { /// let (input, key) = take_till(|c| c == '=')(input)?; /// let (input, _) = tag("=")(input)?; /// let (input, value) = rest(input)?; /// /// Ok((input, (key.into_fragment(), value.into_fragment()))) /// } /// /// fn main() { /// let span = LocatedSpan::new("key=value"); /// let (_, pair) = parse_pair(span).unwrap(); /// assert_eq!(pair, ("key", "value")); /// } /// ``` pub fn into_fragment(self) -> T { self.fragment } /// Takes ownership of the fragment and extra data without (re)borrowing them. pub fn into_fragment_and_extra(self) -> (T, X) { (self.fragment, self.extra) } } impl LocatedSpan { // Attempt to get the "original" data slice back, by extending // self.fragment backwards by self.offset. // Note that any bytes truncated from after self.fragment will not // be recovered. fn get_unoffsetted_slice(&self) -> &[u8] { let self_bytes = self.fragment.as_bytes(); let self_ptr = self_bytes.as_ptr(); unsafe { assert!( self.offset <= isize::max_value() as usize, "offset is too big" ); let orig_input_ptr = self_ptr.offset(-(self.offset as isize)); slice::from_raw_parts(orig_input_ptr, self.offset + self_bytes.len()) } } fn get_columns_and_bytes_before(&self) -> (usize, &[u8]) { let before_self = &self.get_unoffsetted_slice()[..self.offset]; let column = match memchr::memrchr(b'\n', before_self) { None => self.offset + 1, Some(pos) => self.offset - pos, }; (column, &before_self[self.offset - (column - 1)..]) } /// Return the line that contains this LocatedSpan. /// /// The `get_column` and `get_utf8_column` functions returns /// indexes that corresponds to the line returned by this function. 
/// /// Note that if this LocatedSpan ends before the end of the /// original data, the result of calling `get_line_beginning()` /// will not include any data from after the LocatedSpan. /// /// ``` /// # extern crate nom_locate; /// # extern crate nom; /// # use nom_locate::LocatedSpan; /// # use nom::{Slice, FindSubstring}; /// # /// # fn main() { /// let program = LocatedSpan::new( /// "Hello World!\ /// \nThis is a multi-line input\ /// \nthat ends after this line.\n"); /// let multi = program.find_substring("multi").unwrap(); /// /// assert_eq!( /// program.slice(multi..).get_line_beginning(), /// "This is a multi-line input".as_bytes(), /// ); /// # } /// ``` pub fn get_line_beginning(&self) -> &[u8] { let column0 = self.get_column() - 1; let the_line = &self.get_unoffsetted_slice()[self.offset - column0..]; match memchr::memchr(b'\n', &the_line[column0..]) { None => the_line, Some(pos) => &the_line[..column0 + pos], } } /// Return the column index, assuming 1 byte = 1 column. /// /// Use it for ascii text, or use get_utf8_column for UTF8. /// /// # Example of use /// ``` /// /// # extern crate nom_locate; /// # extern crate nom; /// # use nom_locate::LocatedSpan; /// # use nom::Slice; /// # /// # fn main() { /// let span = LocatedSpan::new("foobar"); /// /// assert_eq!(span.slice(3..).get_column(), 4); /// # } /// ``` pub fn get_column(&self) -> usize { self.get_columns_and_bytes_before().0 } /// Return the column index for UTF8 text. Return value is unspecified for non-utf8 text. /// /// This version uses bytecount's hyper algorithm to count characters. This is much faster /// for long lines, but is non-negligibly slower for short slices (below around 100 bytes). /// This is also sped up significantly more depending on architecture and enabling the simd /// feature gates. If you expect primarily short lines, you may get a noticeable speedup in /// parsing by using `naive_get_utf8_column` instead. Benchmark your specific use case! 
/// /// # Example of use /// ``` /// /// # extern crate nom_locate; /// # extern crate nom; /// # use nom_locate::LocatedSpan; /// # use nom::{Slice, FindSubstring}; /// # /// # fn main() { /// let span = LocatedSpan::new("メカジキ"); /// let indexOf3dKanji = span.find_substring("ジ").unwrap(); /// /// assert_eq!(span.slice(indexOf3dKanji..).get_column(), 7); /// assert_eq!(span.slice(indexOf3dKanji..).get_utf8_column(), 3); /// # } /// ``` pub fn get_utf8_column(&self) -> usize { let before_self = self.get_columns_and_bytes_before().1; num_chars(before_self) + 1 } /// Return the column index for UTF8 text. Return value is unspecified for non-utf8 text. /// /// A simpler implementation of `get_utf8_column` that may be faster on shorter lines. /// If benchmarking shows that this is faster, you can use it instead of `get_utf8_column`. /// Prefer defaulting to `get_utf8_column` unless this legitimately is a performance bottleneck. /// /// # Example of use /// ``` /// /// # extern crate nom_locate; /// # extern crate nom; /// # use nom_locate::LocatedSpan; /// # use nom::{Slice, FindSubstring}; /// # /// # fn main() { /// let span = LocatedSpan::new("メカジキ"); /// let indexOf3dKanji = span.find_substring("ジ").unwrap(); /// /// assert_eq!(span.slice(indexOf3dKanji..).get_column(), 7); /// assert_eq!(span.slice(indexOf3dKanji..).naive_get_utf8_column(), 3); /// # } /// ``` pub fn naive_get_utf8_column(&self) -> usize { let before_self = self.get_columns_and_bytes_before().1; naive_num_chars(before_self) + 1 } } impl Hash for LocatedSpan { fn hash(&self, state: &mut H) { self.offset.hash(state); self.line.hash(state); self.fragment.hash(state); } } impl From for LocatedSpan { fn from(i: T) -> Self { Self::new_extra(i, X::default()) } } impl PartialEq for LocatedSpan { fn eq(&self, other: &Self) -> bool { self.line == other.line && self.offset == other.offset && self.fragment == other.fragment } } impl Eq for LocatedSpan {} impl AsBytes for LocatedSpan { fn as_bytes(&self) -> 
&[u8] {
        self.fragment.as_bytes()
    }
}

impl<T: InputLength, X> InputLength for LocatedSpan<T, X> {
    fn input_len(&self) -> usize {
        self.fragment.input_len()
    }
}

impl<T, X> InputTake for LocatedSpan<T, X>
where
    Self: Slice<RangeFrom<usize>> + Slice<RangeTo<usize>>,
{
    fn take(&self, count: usize) -> Self {
        self.slice(..count)
    }

    fn take_split(&self, count: usize) -> (Self, Self) {
        (self.slice(count..), self.slice(..count))
    }
}

impl<T, X> InputTakeAtPosition for LocatedSpan<T, X>
where
    T: InputTakeAtPosition + InputLength + InputIter,
    Self: Slice<RangeFrom<usize>> + Slice<RangeTo<usize>> + Clone,
{
    type Item = <T as InputIter>::Item;

    fn split_at_position_complete<P, E: ParseError<Self>>(
        &self,
        predicate: P,
    ) -> IResult<Self, Self, E>
    where
        P: Fn(Self::Item) -> bool,
    {
        match self.split_at_position(predicate) {
            Err(Err::Incomplete(_)) => Ok(self.take_split(self.input_len())),
            res => res,
        }
    }

    fn split_at_position<P, E: ParseError<Self>>(&self, predicate: P) -> IResult<Self, Self, E>
    where
        P: Fn(Self::Item) -> bool,
    {
        match self.fragment.position(predicate) {
            Some(n) => Ok(self.take_split(n)),
            None => Err(Err::Incomplete(nom::Needed::new(1))),
        }
    }

    fn split_at_position1<P, E: ParseError<Self>>(
        &self,
        predicate: P,
        e: ErrorKind,
    ) -> IResult<Self, Self, E>
    where
        P: Fn(Self::Item) -> bool,
    {
        match self.fragment.position(predicate) {
            Some(0) => Err(Err::Error(E::from_error_kind(self.clone(), e))),
            Some(n) => Ok(self.take_split(n)),
            None => Err(Err::Incomplete(nom::Needed::new(1))),
        }
    }

    fn split_at_position1_complete<P, E: ParseError<Self>>(
        &self,
        predicate: P,
        e: ErrorKind,
    ) -> IResult<Self, Self, E>
    where
        P: Fn(Self::Item) -> bool,
    {
        match self.fragment.position(predicate) {
            Some(0) => Err(Err::Error(E::from_error_kind(self.clone(), e))),
            Some(n) => Ok(self.take_split(n)),
            None => {
                if self.fragment.input_len() == 0 {
                    Err(Err::Error(E::from_error_kind(self.clone(), e)))
                } else {
                    Ok(self.take_split(self.input_len()))
                }
            }
        }
    }
}

#[macro_export]
#[deprecated(
    since = "3.1.0",
    note = "this implementation has been generalized and no longer requires a macro"
)]
macro_rules!
impl_input_iter {
    () => {};
}

impl<'a, T, X> InputIter for LocatedSpan<T, X>
where
    T: InputIter,
{
    type Item = T::Item;
    type Iter = T::Iter;
    type IterElem = T::IterElem;
    #[inline]
    fn iter_indices(&self) -> Self::Iter {
        self.fragment.iter_indices()
    }
    #[inline]
    fn iter_elements(&self) -> Self::IterElem {
        self.fragment.iter_elements()
    }
    #[inline]
    fn position<P>(&self, predicate: P) -> Option<usize>
    where
        P: Fn(Self::Item) -> bool,
    {
        self.fragment.position(predicate)
    }
    #[inline]
    fn slice_index(&self, count: usize) -> Result<usize, nom::Needed> {
        self.fragment.slice_index(count)
    }
}

impl<A: Compare<B>, B: Into<LocatedSpan<B>>, X> Compare<B> for LocatedSpan<A, X> {
    #[inline(always)]
    fn compare(&self, t: B) -> CompareResult {
        self.fragment.compare(t.into().fragment)
    }

    #[inline(always)]
    fn compare_no_case(&self, t: B) -> CompareResult {
        self.fragment.compare_no_case(t.into().fragment)
    }
}

#[macro_export]
#[deprecated(
    since = "2.1.0",
    note = "this implementation has been generalized and no longer requires a macro"
)]
macro_rules! impl_compare {
    ( $fragment_type:ty, $compare_to_type:ty ) => {};
}

#[macro_export]
#[deprecated(
    since = "3.1.0",
    note = "this implementation has been generalized and no longer requires a macro"
)]
macro_rules! impl_slice_range {
    ( $fragment_type:ty, $range_type:ty, $can_return_self:expr ) => {};
}

#[macro_export]
#[deprecated(
    since = "3.1.0",
    note = "this implementation has been generalized and no longer requires a macro"
)]
macro_rules!
impl_slice_ranges {
    ( $fragment_type:ty ) => {};
}

impl<'a, T, R, X: Clone> Slice<R> for LocatedSpan<T, X>
where
    T: Slice<R> + Offset + AsBytes + Slice<RangeTo<usize>>,
{
    fn slice(&self, range: R) -> Self {
        let next_fragment = self.fragment.slice(range);
        let consumed_len = self.fragment.offset(&next_fragment);
        if consumed_len == 0 {
            return LocatedSpan {
                line: self.line,
                offset: self.offset,
                fragment: next_fragment,
                extra: self.extra.clone(),
            };
        }

        let consumed = self.fragment.slice(..consumed_len);

        let next_offset = self.offset + consumed_len;

        let consumed_as_bytes = consumed.as_bytes();
        let iter = Memchr::new(b'\n', consumed_as_bytes);
        let number_of_lines = iter.count() as u32;
        let next_line = self.line + number_of_lines;

        LocatedSpan {
            line: next_line,
            offset: next_offset,
            fragment: next_fragment,
            extra: self.extra.clone(),
        }
    }
}

impl<Fragment: FindToken<Token>, Token, X> FindToken<Token> for LocatedSpan<Fragment, X> {
    fn find_token(&self, token: Token) -> bool {
        self.fragment.find_token(token)
    }
}

impl<T, U, X> FindSubstring<U> for LocatedSpan<T, X>
where
    T: FindSubstring<U>,
{
    #[inline]
    fn find_substring(&self, substr: U) -> Option<usize> {
        self.fragment.find_substring(substr)
    }
}

impl<R: FromStr, T, X> ParseTo<R> for LocatedSpan<T, X>
where
    T: ParseTo<R>,
{
    #[inline]
    fn parse_to(&self) -> Option<R> {
        self.fragment.parse_to()
    }
}

impl<T, X> Offset for LocatedSpan<T, X> {
    fn offset(&self, second: &Self) -> usize {
        let fst = self.offset;
        let snd = second.offset;

        snd - fst
    }
}

#[cfg(feature = "alloc")]
impl<T: ToString, X> Display for LocatedSpan<T, X> {
    fn fmt(&self, fmt: &mut Formatter) -> FmtResult {
        fmt.write_str(&self.fragment.to_string())
    }
}

#[macro_export]
#[deprecated(
    since = "3.1.0",
    note = "this implementation has been generalized and no longer requires a macro"
)]
macro_rules!
impl_extend_into { ($fragment_type:ty, $item:ty, $extender:ty) => { impl<'a, X> ExtendInto for LocatedSpan<$fragment_type, X> { type Item = $item; type Extender = $extender; #[inline] fn new_builder(&self) -> Self::Extender { self.fragment.new_builder() } #[inline] fn extend_into(&self, acc: &mut Self::Extender) { self.fragment.extend_into(acc) } } }; } #[cfg(feature = "alloc")] impl<'a, T, X> ExtendInto for LocatedSpan where T: ExtendInto, { type Item = T::Item; type Extender = T::Extender; #[inline] fn new_builder(&self) -> Self::Extender { self.fragment.new_builder() } #[inline] fn extend_into(&self, acc: &mut Self::Extender) { self.fragment.extend_into(acc) } } #[cfg(feature = "std")] #[macro_export] #[deprecated( since = "2.1.0", note = "this implementation has been generalized and no longer requires a macro" )] macro_rules! impl_hex_display { ($fragment_type:ty) => {}; } /// Capture the position of the current fragment #[macro_export] macro_rules! position { ($input:expr,) => { tag!($input, "") }; } /// Capture the position of the current fragment pub fn position(s: T) -> IResult where E: ParseError, T: InputIter + InputTake, { nom::bytes::complete::take(0usize)(s) } nom_locate-4.2.0/src/tests.rs000066400000000000000000000352201447116175100161450ustar00rootroot00000000000000mod lib { #[cfg(feature = "std")] pub mod std { pub use std::string::ToString; pub use std::vec::Vec; } #[cfg(all(not(feature = "std"), feature = "alloc"))] pub mod std { pub use alloc::string::ToString; pub use alloc::vec::Vec; } } #[cfg(feature = "alloc")] use lib::std::*; use super::LocatedSpan; #[cfg(feature = "alloc")] use nom::ParseTo; use nom::{ error::ErrorKind, Compare, CompareResult, FindSubstring, FindToken, InputIter, InputTake, InputTakeAtPosition, Offset, Slice, }; type StrSpan<'a> = LocatedSpan<&'a str>; type BytesSpan<'a> = LocatedSpan<&'a [u8]>; type StrSpanEx<'a, 'b> = LocatedSpan<&'a str, &'b str>; type BytesSpanEx<'a, 'b> = LocatedSpan<&'a [u8], &'b str>; #[test] fn 
new_sould_be_the_same_as_new_extra() { let byteinput = &b"foobar"[..]; assert_eq!( BytesSpan::new(byteinput), LocatedSpan::new_extra(byteinput, ()) ); let strinput = "foobar"; assert_eq!(StrSpan::new(strinput), LocatedSpan::new_extra(strinput, ())); } #[test] fn it_should_call_new_for_u8_successfully() { let input = &b"foobar"[..]; let output = BytesSpan { offset: 0, line: 1, fragment: input, extra: (), }; assert_eq!(BytesSpan::new(input), output); } #[test] fn it_should_convert_from_u8_successfully() { let input = &b"foobar"[..]; assert_eq!(BytesSpan::new(input), input.into()); assert_eq!(BytesSpanEx::new_extra(input, "extra"), input.into()); } #[test] fn it_should_call_new_for_str_successfully() { let input = &"foobar"[..]; let output = StrSpan { offset: 0, line: 1, fragment: input, extra: (), }; assert_eq!(StrSpan::new(input), output); } #[test] fn it_should_convert_from_str_successfully() { let input = &"foobar"[..]; assert_eq!(StrSpan::new(input), input.into()); assert_eq!(StrSpanEx::new_extra(input, "extra"), input.into()); } #[test] fn it_should_ignore_extra_for_equality() { let input = &"foobar"[..]; assert_eq!( StrSpanEx::new_extra(input, "foo"), StrSpanEx::new_extra(input, "bar") ); } #[cfg(feature = "std")] #[test] fn it_should_ignore_extra_for_hash() { use std::collections::hash_map::DefaultHasher; use std::hash::{Hash, Hasher}; fn calculate_hash(t: &T) -> u64 { let mut s = DefaultHasher::new(); t.hash(&mut s); s.finish() } let input = &"foobar"[..]; assert_eq!( calculate_hash(&StrSpanEx::new_extra(input, "foo")), calculate_hash(&StrSpanEx::new_extra(input, "bar")) ); } #[test] fn it_should_slice_for_str() { let str_slice = StrSpanEx::new_extra("foobar", "extra"); assert_eq!( str_slice.slice(1..), StrSpanEx { offset: 1, line: 1, fragment: "oobar", extra: "extra", } ); assert_eq!( str_slice.slice(1..3), StrSpanEx { offset: 1, line: 1, fragment: "oo", extra: "extra", } ); assert_eq!( str_slice.slice(..3), StrSpanEx { offset: 0, line: 1, fragment: "foo", 
extra: "extra", } ); assert_eq!(str_slice.slice(..), str_slice); } #[test] fn it_should_slice_for_u8() { let bytes_slice = BytesSpanEx::new_extra(b"foobar", "extra"); assert_eq!( bytes_slice.slice(1..), BytesSpanEx { offset: 1, line: 1, fragment: b"oobar", extra: "extra", } ); assert_eq!( bytes_slice.slice(1..3), BytesSpanEx { offset: 1, line: 1, fragment: b"oo", extra: "extra", } ); assert_eq!( bytes_slice.slice(..3), BytesSpanEx { offset: 0, line: 1, fragment: b"foo", extra: "extra", } ); assert_eq!(bytes_slice.slice(..), bytes_slice); } #[test] fn it_should_calculate_columns() { let input = StrSpan::new( "foo bar", ); let bar_idx = input.find_substring("bar").unwrap(); assert_eq!(input.slice(bar_idx..).get_column(), 9); } #[test] fn it_should_calculate_columns_accurately_with_non_ascii_chars() { let s = StrSpan::new("メカジキ"); assert_eq!(s.slice(6..).get_utf8_column(), 3); } #[test] #[should_panic(expected = "offset is too big")] fn it_should_panic_when_getting_column_if_offset_is_too_big() { let s = StrSpanEx { offset: usize::max_value(), fragment: "", line: 1, extra: "", }; s.get_column(); } #[cfg(feature = "alloc")] #[test] fn it_should_iterate_indices() { let str_slice = StrSpan::new("foobar"); assert_eq!( str_slice.iter_indices().collect::>(), vec![(0, 'f'), (1, 'o'), (2, 'o'), (3, 'b'), (4, 'a'), (5, 'r')] ); assert_eq!( StrSpan::new("") .iter_indices() .collect::>(), vec![] ); } #[cfg(feature = "alloc")] #[test] fn it_should_iterate_elements() { let str_slice = StrSpan::new("foobar"); assert_eq!( str_slice.iter_elements().collect::>(), vec!['f', 'o', 'o', 'b', 'a', 'r'] ); assert_eq!( StrSpan::new("").iter_elements().collect::>(), vec![] ); } #[test] fn it_should_position_char() { let str_slice = StrSpan::new("foobar"); assert_eq!(str_slice.position(|x| x == 'a'), Some(4)); assert_eq!(str_slice.position(|x| x == 'c'), None); } #[test] fn it_should_compare_elements() { assert_eq!(StrSpan::new("foobar").compare("foo"), CompareResult::Ok); 
assert_eq!(StrSpan::new("foobar").compare("bar"), CompareResult::Error); assert_eq!(StrSpan::new("foobar").compare("foobar"), CompareResult::Ok); assert_eq!( StrSpan::new("foobar").compare_no_case("fooBar"), CompareResult::Ok ); assert_eq!( StrSpan::new("foobar").compare("foobarbaz"), CompareResult::Incomplete ); assert_eq!( BytesSpan::new(b"foobar").compare(b"foo" as &[u8]), CompareResult::Ok ); } #[test] #[allow(unused_parens)] fn it_should_find_token() { assert!(StrSpan::new("foobar").find_token('a')); assert!(StrSpan::new("foobar").find_token(b'a')); assert!(StrSpan::new("foobar").find_token(&(b'a'))); assert!(!StrSpan::new("foobar").find_token('c')); assert!(!StrSpan::new("foobar").find_token(b'c')); assert!(!StrSpan::new("foobar").find_token((&b'c'))); assert!(BytesSpan::new(b"foobar").find_token(b'a')); assert!(BytesSpan::new(b"foobar").find_token(&(b'a'))); assert!(!BytesSpan::new(b"foobar").find_token(b'c')); assert!(!BytesSpan::new(b"foobar").find_token((&b'c'))); } #[test] fn it_should_find_substring() { assert_eq!(StrSpan::new("foobar").find_substring("bar"), Some(3)); assert_eq!(StrSpan::new("foobar").find_substring("baz"), None); assert_eq!(BytesSpan::new(b"foobar").find_substring("bar"), Some(3)); assert_eq!(BytesSpan::new(b"foobar").find_substring("baz"), None); assert_eq!( BytesSpan::new(b"foobar").find_substring(b"bar" as &[u8]), Some(3) ); assert_eq!( BytesSpan::new(b"foobar").find_substring(b"baz" as &[u8]), None ); } #[cfg(feature = "alloc")] #[test] fn it_should_parse_to_string() { assert_eq!( StrSpan::new("foobar").parse_to(), Some("foobar".to_string()) ); assert_eq!( BytesSpan::new(b"foobar").parse_to(), Some("foobar".to_string()) ); } // https://github.com/Geal/nom/blob/eee82832fafdfdd0505546d224caa466f7d39a15/src/util.rs#L710-L720 #[test] fn it_should_calculate_offset_for_u8() { let s = b"abcd123"; let a = &s[..]; let b = &a[2..]; let c = &a[..4]; let d = &a[3..5]; assert_eq!(a.offset(b), 2); assert_eq!(a.offset(c), 0); 
assert_eq!(a.offset(d), 3); } // https://github.com/Geal/nom/blob/eee82832fafdfdd0505546d224caa466f7d39a15/src/util.rs#L722-L732 #[test] fn it_should_calculate_offset_for_str() { let s = StrSpan::new("abcřèÂßÇd123"); let a = s.slice(..); let b = a.slice(7..); let c = a.slice(..5); let d = a.slice(5..9); assert_eq!(a.offset(&b), 7); assert_eq!(a.offset(&c), 0); assert_eq!(a.offset(&d), 5); } #[test] fn it_should_take_chars() { let s = StrSpanEx::new_extra("abcdefghij", "extra"); assert_eq!( s.take(5), StrSpanEx { offset: 0, line: 1, fragment: "abcde", extra: "extra", } ); } #[test] fn it_should_take_split_chars() { let s = StrSpanEx::new_extra("abcdefghij", "extra"); assert_eq!( s.take_split(5), ( StrSpanEx { offset: 5, line: 1, fragment: "fghij", extra: "extra", }, StrSpanEx { offset: 0, line: 1, fragment: "abcde", extra: "extra", } ) ); } type TestError<'a, 'b> = (LocatedSpan<&'a str, &'b str>, nom::error::ErrorKind); #[test] fn it_should_split_at_position() { let s = StrSpanEx::new_extra("abcdefghij", "extra"); assert_eq!( s.split_at_position::<_, TestError>(|c| { c == 'f' }), Ok(( StrSpanEx { offset: 5, line: 1, fragment: "fghij", extra: "extra", }, StrSpanEx { offset: 0, line: 1, fragment: "abcde", extra: "extra", } )) ); } // TODO also test split_at_position with an error #[test] fn it_should_split_at_position1() { let s = StrSpanEx::new_extra("abcdefghij", "extra"); assert_eq!( s.split_at_position1::<_, TestError>(|c| { c == 'f' }, ErrorKind::Alpha), s.split_at_position::<_, TestError>(|c| { c == 'f' }), ); } #[test] fn it_should_capture_position() { use super::position; use nom::bytes::complete::{tag, take_until}; use nom::IResult; fn parser<'a>(s: StrSpan<'a>) -> IResult, (StrSpan<'a>, &'a str)> { let (s, _) = take_until("def")(s)?; let (s, p) = position(s)?; let (s, t) = tag("def")(s)?; Ok((s, (p, t.fragment))) } let s = StrSpan::new("abc\ndefghij"); let (_, (s, t)) = parser(s).unwrap(); assert_eq!(s.offset, 4); assert_eq!(s.line, 2); assert_eq!(t, "def"); 
}

#[test]
fn it_should_deref_to_fragment() {
    let input = &"foobar"[..];
    assert_eq!(*StrSpanEx::new_extra(input, "extra"), input);
    let input = &b"foobar"[..];
    assert_eq!(*BytesSpanEx::new_extra(input, "extra"), input);
}

#[cfg(feature = "std")]
#[test]
fn it_should_display_hex() {
    use nom::HexDisplay;
    assert_eq!(
        StrSpan::new(&"abc"[..]).to_hex(4),
        "00000000\t61 62 63    \tabc\n".to_owned()
    );
    assert_eq!(
        BytesSpanEx::new_extra(&b"abc"[..], "extra").to_hex(4),
        "00000000\t61 62 63    \tabc\n".to_owned()
    );
}

#[test]
fn line_of_empty_span_is_empty() {
    assert_eq!(StrSpan::new("").get_line_beginning(), "".as_bytes());
}

#[test]
fn line_of_single_line_start_is_whole() {
    assert_eq!(
        StrSpan::new("A single line").get_line_beginning(),
        "A single line".as_bytes(),
    );
}

#[test]
fn line_of_single_line_end_is_whole() {
    let data = "A single line";
    assert_eq!(
        StrSpan::new(data).slice(data.len()..).get_line_beginning(),
        "A single line".as_bytes(),
    );
}

#[test]
fn line_of_start_is_first() {
    assert_eq!(
        StrSpan::new(
            "One line of text\
            \nFollowed by a second\
            \nand a third\n"
        )
        .get_line_beginning(),
        "One line of text".as_bytes(),
    );
}

#[test]
fn line_of_nl_is_before() {
    let data = "One line of text\
                \nFollowed by a second\
                \nand a third\n";
    assert_eq!(
        StrSpan::new(data)
            .slice(data.find('\n').unwrap()..)
            .get_line_beginning(),
        "One line of text".as_bytes(),
    );
}

#[test]
fn line_of_end_after_nl_is_empty() {
    let data = "One line of text\
                \nFollowed by a second\
                \nand a third\n";
    assert_eq!(
        StrSpan::new(data).slice(data.len()..).get_line_beginning(),
        "".as_bytes(),
    );
}

#[test]
fn line_of_end_no_nl_is_last() {
    let data = "One line of text\
                \nFollowed by a second\
                \nand a third";
    assert_eq!(
        StrSpan::new(data).slice(data.len()..).get_line_beginning(),
        "and a third".as_bytes(),
    );
}

/// This test documents how `get_line_beginning()` differs from
/// a hypothetical `get_line()` method.
#[test] fn line_begining_may_ot_be_entire_len() { let data = "One line of text\ \nFollowed by a second\ \nand a third"; let by = "by"; let pos = data.find_substring(by).unwrap(); assert_eq!( StrSpan::new(data) .slice(pos..pos + by.len()) .get_line_beginning(), "Followed by".as_bytes(), ); } #[cfg(feature = "std")] #[test] fn line_for_non_ascii_chars() { let data = StrSpan::new( "Några rader text på Svenska.\ \nFörra raden var först, den här är i mitten\ \noch här är sista raden.\n", ); let s = data.slice(data.find_substring("först").unwrap()..); assert_eq!( format!( "{line_no:3}: {line_text}\n {0:>lpos$}^- The match\n", "", line_no = s.location_line(), line_text = core::str::from_utf8(s.get_line_beginning()).unwrap(), lpos = s.get_utf8_column(), ), " 2: Förra raden var först, den här är i mitten\ \n ^- The match\n", ); } #[test] fn it_should_implement_as_ref_for_the_underlying_type() { // LocatedSpan<&str> should implement AsRef. { fn function_accepting_str>(_s: S) {} let str_data = StrSpan::new("some data"); function_accepting_str(str_data); } // LocatedSpan<&[u8]> should implement AsRef<[u8]>. { fn function_accepting_u8_slice>(_data: B) {} let bytes_data = BytesSpan::new(b"some binary data"); function_accepting_u8_slice(bytes_data); } } #[cfg(feature = "std")] #[test] fn it_should_implement_as_ref_impls_for_the_underlying_type() { // Since str implements AsRef, it's useful to have LocatedSpan<&str> // implement AsRef as well. 
fn function_accepting_path>(_path: P) {} let str_data = StrSpan::new("some data"); function_accepting_path(str_data); } nom_locate-4.2.0/tests/000077500000000000000000000000001447116175100150065ustar00rootroot00000000000000nom_locate-4.2.0/tests/integration_tests.rs000066400000000000000000000247101447116175100211250ustar00rootroot00000000000000use nom::{error::ErrorKind, error_position, AsBytes, FindSubstring, IResult, InputLength, Slice}; use nom_locate::LocatedSpan; use std::cmp; use std::fmt::Debug; use std::ops::{Range, RangeFull}; #[cfg(feature = "alloc")] use nom::bytes::complete::escaped_transform; #[cfg(any(feature = "std", feature = "alloc"))] use nom::{ bytes::complete::{tag, take_until}, character::complete::{char, multispace0}, combinator::eof, multi::many0, sequence::{delimited, preceded}, }; type StrSpan<'a> = LocatedSpan<&'a str>; type BytesSpan<'a> = LocatedSpan<&'a [u8]>; #[cfg(any(feature = "std", feature = "alloc"))] fn simple_parser_str(i: StrSpan) -> IResult> { let (i, foo) = delimited(multispace0, tag("foo"), multispace0)(i)?; let (i, bar) = delimited(multispace0, tag("bar"), multispace0)(i)?; let (i, baz) = many0(delimited(multispace0, tag("baz"), multispace0))(i)?; let (i, eof) = eof(i)?; Ok({ let mut res = vec![foo, bar]; res.extend(baz); res.push(eof); (i, res) }) } #[cfg(any(feature = "std", feature = "alloc"))] fn simple_parser_u8(i: BytesSpan) -> IResult> { let (i, foo) = delimited(multispace0, tag("foo"), multispace0)(i)?; let (i, bar) = delimited(multispace0, tag("bar"), multispace0)(i)?; let (i, baz) = many0(delimited(multispace0, tag("baz"), multispace0))(i)?; let (i, eof) = eof(i)?; Ok({ let mut res = vec![foo, bar]; res.extend(baz); res.push(eof); (i, res) }) } struct Position { line: u32, column: usize, offset: usize, fragment_len: usize, } fn test_str_fragments<'a, F, T>(parser: F, input: T, positions: Vec) where F: Fn(LocatedSpan) -> IResult, Vec>>, T: InputLength + Slice> + Slice + Debug + PartialEq + AsBytes, { let res = 
parser(LocatedSpan::new(input.slice(..))) .map_err(|err| { eprintln!( "for={:?} -- The parser should run successfully\n{:?}", input, err ); format!("The parser should run successfully") }) .unwrap(); // assert!(res.is_ok(), "the parser should run successfully"); let (remaining, output) = res; assert!( remaining.fragment().input_len() == 0, "no input should remain" ); assert_eq!(output.len(), positions.len()); for (output_item, pos) in output.iter().zip(positions.iter()) { assert_eq!(output_item.location_offset(), pos.offset); assert_eq!(output_item.location_line(), pos.line); assert_eq!( output_item.fragment(), &input.slice(pos.offset..cmp::min(pos.offset + pos.fragment_len, input.input_len())) ); assert_eq!( output_item.get_utf8_column(), pos.column, "columns should be equal" ); } } #[cfg(any(feature = "std", feature = "alloc"))] #[test] fn it_locates_str_fragments() { test_str_fragments( simple_parser_str, "foobarbaz", vec![ Position { line: 1, column: 1, offset: 0, fragment_len: 3, }, Position { line: 1, column: 4, offset: 3, fragment_len: 3, }, Position { line: 1, column: 7, offset: 6, fragment_len: 3, }, Position { line: 1, column: 10, offset: 9, fragment_len: 3, }, ], ); test_str_fragments( simple_parser_str, " foo bar baz", vec![ Position { line: 1, column: 2, offset: 1, fragment_len: 3, }, Position { line: 2, column: 9, offset: 13, fragment_len: 3, }, Position { line: 3, column: 13, offset: 29, fragment_len: 3, }, Position { line: 3, column: 16, offset: 32, fragment_len: 3, }, ], ); } #[cfg(any(feature = "std", feature = "alloc"))] #[test] fn it_locates_u8_fragments() { test_str_fragments( simple_parser_u8, b"foobarbaz", vec![ Position { line: 1, column: 1, offset: 0, fragment_len: 3, }, Position { line: 1, column: 4, offset: 3, fragment_len: 3, }, Position { line: 1, column: 7, offset: 6, fragment_len: 3, }, Position { line: 1, column: 10, offset: 9, fragment_len: 3, }, ], ); test_str_fragments( simple_parser_u8, b" foo bar baz", vec![ Position { line: 1, 
column: 2, offset: 1, fragment_len: 3, }, Position { line: 2, column: 9, offset: 13, fragment_len: 3, }, Position { line: 3, column: 13, offset: 29, fragment_len: 3, }, Position { line: 3, column: 16, offset: 32, fragment_len: 3, }, ], ); } fn find_substring<'a>( input: StrSpan<'a>, substr: &'static str, ) -> IResult, StrSpan<'a>> { let substr_len = substr.len(); match input.find_substring(substr) { None => Err(nom::Err::Error(error_position!(input, ErrorKind::Tag))), Some(pos) => Ok(( input.slice(pos + substr_len..), input.slice(pos..pos + substr_len), )), } } #[cfg(feature = "alloc")] #[test] fn test_escaped_string() { #[allow(unused)] use nom::Needed; // https://github.com/Geal/nom/issues/780 fn string(i: StrSpan) -> IResult { delimited( char('"'), escaped_transform( nom::character::complete::alpha1, '\\', nom::character::complete::anychar, ), char('"'), )(i) } let res = string(LocatedSpan::new("\"foo\\\"bar\"")); assert!(res.is_ok()); let (span, remaining) = res.unwrap(); assert_eq!(span.location_offset(), 10); assert_eq!(span.location_line(), 1); assert_eq!(span.fragment(), &""); assert_eq!(remaining, "foo\"bar".to_string()); } #[cfg(any(feature = "std", feature = "alloc"))] fn plague(i: StrSpan) -> IResult> { let (i, ojczyzno) = find_substring(i, "Ojczyzno")?; let (i, jak) = many0(|i| find_substring(i, "jak "))(i)?; let (i, zielona) = find_substring(i, "Zielona")?; let (i, _) = preceded(take_until("."), tag("."))(i)?; Ok({ let mut res = vec![ojczyzno]; res.extend(jak); res.push(zielona); (i, res) }) } #[cfg(any(feature = "std", feature = "alloc"))] #[test] fn it_locates_complex_fragments() { // Pan Tadeusz. https://pl.m.wikisource.org/wiki/Pan_Tadeusz_(wyd._1834)/Ksi%C4%99ga_pierwsza let input = "Litwo! Ojczyzno moja! ty jestes jak zdrowie; Ile cie trzeba cenic, ten tylko sie dowie Kto cie stracil. Dzis pieknosc twa w calej ozdobie Widze i opisuje, bo tesknie po tobie. Panno swieta, co jasnej bronisz Czestochowy I w Ostrej swiecisz Bramie! 
Ty, co grod zamkowy Nowogrodzki ochraniasz z jego wiernym ludem! Jak mnie dziecko do zdrowia powrocilas cudem, (Gdy od placzacej matki, pod Twoje opieke Ofiarowany, martwa podnioslem powieke; I zaraz moglem pieszo, do Twych swiatyn progu Isc za wrocone zycie podziekowac Bogu;) Tak nas powrocisz cudem na Ojczyzny lono. Tymczasem przenos moje dusze uteskniona Do tych pagorkow lesnych, do tych lak zielonych, Szeroko nad blekitnym Niemnem rosciagnionych; Do tych pol malowanych zbozem rozmaitem, Wyzlacanych pszenica, posrebrzanych zytem; Gdzie bursztynowy swierzop, gryka jak snieg biala, Gdzie panienskim rumiencem dziecielina pala, A wszystko przepasane jakby wstega, miedza Zielona, na niej zrzadka ciche grusze siedza."; let expected = vec![ Position { line: 1, column: 8, offset: 7, fragment_len: 8, }, Position { line: 1, column: 33, offset: 32, fragment_len: 4, }, Position { line: 21, column: 35, offset: 823, fragment_len: 4, }, Position { line: 24, column: 1, offset: 928, fragment_len: 7, }, ]; test_str_fragments(plague, input, expected); } #[cfg(any(feature = "std", feature = "alloc"))] #[test] fn test_take_until_str() { fn parser(i: StrSpan) -> IResult { let (i, _) = delimited(take_until("foo"), tag("foo"), multispace0)(i)?; let (i, _) = delimited(take_until("bar"), tag("bar"), multispace0)(i)?; let (i, _) = eof(i)?; Ok((i, ())) } let res = parser(LocatedSpan::new(" X foo Y bar ")); assert!(res.is_ok()); let (span, _) = res.unwrap(); assert_eq!(span.location_offset(), 13); assert_eq!(span.location_line(), 1); assert_eq!(*span.fragment(), ""); } #[cfg(any(feature = "std", feature = "alloc"))] #[test] fn test_take_until_u8() { fn parser(i: BytesSpan) -> IResult { // Mix string and byte conditions. 
let (i, _) = delimited(take_until("foo"), tag("foo"), multispace0)(i)?; let (i, _) = delimited(take_until(&b"bar"[..]), tag(&b"bar"[..]), multispace0)(i)?; let (i, _) = eof(i)?; Ok((i, ())) } let res = parser(LocatedSpan::new(&b" X foo Y bar "[..])); assert!(res.is_ok()); let (span, _) = res.unwrap(); assert_eq!(span.location_offset(), 13); assert_eq!(span.location_line(), 1); assert_eq!(*span.fragment(), &b""[..]); }