<p align="center">
  <img src="https://raw.github.com/pest-parser/pest/master/pest-logo.svg?sanitize=true" width="80%"/>
</p>

# pest. The Elegant Parser

[](https://gitter.im/pest-parser/pest?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[](https://pest.rs/book)
[](https://docs.rs/pest)

[](https://github.com/pest-parser/pest/actions/workflows/ci.yml)
[](https://codecov.io/gh/pest-parser/pest)
<a href="https://blog.rust-lang.org/2021/11/01/Rust-1.61.0.html"><img alt="Rustc Version 1.61.0+" src="https://img.shields.io/badge/rustc-1.61.0%2B-lightgrey.svg"/></a>

[](https://crates.io/crates/pest)
[](https://crates.io/crates/pest)

pest is a general purpose parser written in Rust with a focus on accessibility,
correctness, and performance. It uses parsing expression grammars
(or [PEG]) as input, which are similar in spirit to regular expressions, but
which offer the enhanced expressivity needed to parse complex languages.

[PEG]: https://en.wikipedia.org/wiki/Parsing_expression_grammar

## Getting started

The recommended way to start parsing with pest is to read the official [book].

Other helpful resources:

* API reference on [docs.rs]
* play with grammars and share them on our [fiddle]
* find answers to common questions or ask new ones on [GitHub Discussions]
* leave feedback, ask questions, or greet us on [Gitter] or [Discord]

[book]: https://pest.rs/book
[docs.rs]: https://docs.rs/pest
[fiddle]: https://pest.rs/#editor
[Gitter]: https://gitter.im/pest-parser/pest
[Discord]: https://discord.gg/XEGACtWpT2
[GitHub Discussions]: https://github.com/pest-parser/pest/discussions

## Example

The following is an example of a grammar for a list of alphanumeric identifiers
where no identifier starts with a digit:

```rust
alpha = { 'a'..'z' | 'A'..'Z' }
digit = { '0'..'9' }

ident = { !digit ~ (alpha | digit)+ }

ident_list = _{ ident ~ (" " ~ ident)* }
          // ^
          // ident_list rule is silent, which means it produces no tokens
```

Grammars are saved in separate .pest files which are never mixed with procedural
code. This results in an always up-to-date formalization of a language that is
easy to read and maintain.

## Meaningful error reporting

Based on the grammar definition, the parser also includes automatic error
reporting. For the example above, the input `"123"` will result in:

```
thread 'main' panicked at ' --> 1:1
  |
1 | 123
  | ^---
  |
  = unexpected digit', src/main.rs:12
```
while `"ab *"` will result in:
```
thread 'main' panicked at ' --> 1:4
  |
1 | ab *
  |    ^---
  |
  = expected ident', src/main.rs:12
```

These error messages can be obtained from their default `Display` implementation,
e.g. `panic!("{}", parser_result.unwrap_err())` or `println!("{}", e)`.

## Pairs API

The grammar can be used to derive a `Parser` implementation automatically.
Parsing returns an iterator of nested token pairs:

```rust
use pest_derive::Parser;
use pest::Parser;

#[derive(Parser)]
#[grammar = "ident.pest"]
struct IdentParser;

fn main() {
    let pairs = IdentParser::parse(Rule::ident_list, "a1 b2").unwrap_or_else(|e| panic!("{}", e));

    // Because ident_list is silent, the iterator will contain idents
    for pair in pairs {
        // A pair is a combination of the rule which matched and a span of input
        println!("Rule:    {:?}", pair.as_rule());
        println!("Span:    {:?}", pair.as_span());
        println!("Text:    {}", pair.as_str());

        // A pair can be converted to an iterator of the tokens which make it up:
        for inner_pair in pair.into_inner() {
            match inner_pair.as_rule() {
                Rule::alpha => println!("Letter:  {}", inner_pair.as_str()),
                Rule::digit => println!("Digit:   {}", inner_pair.as_str()),
                _ => unreachable!()
            };
        }
    }
}
```

This produces the following output:
```
Rule:    ident
Span:    Span { start: 0, end: 2 }
Text:    a1
Letter:  a
Digit:   1
Rule:    ident
Span:    Span { start: 3, end: 5 }
Text:    b2
Letter:  b
Digit:   2
```

### Defining multiple parsers in a single file
The automatic `Parser` derivation currently generates a `Rule` enum, which
causes name conflicts if multiple structs in the same module derive `Parser`.
One way around this is to put each parser struct in a separate namespace:

```rust
mod a {
    use pest_derive::Parser;

    #[derive(Parser)]
    #[grammar = "a.pest"]
    pub struct ParserA;
}
mod b {
    use pest_derive::Parser;

    #[derive(Parser)]
    #[grammar = "b.pest"]
    pub struct ParserB;
}
```

## Other features

* Precedence climbing
* Input handling
* Custom errors
* Runs on stable Rust

## Projects using pest

You can find more projects and ecosystem tools in the [awesome-pest](https://github.com/pest-parser/awesome-pest) repo.

* [pest_meta](https://github.com/pest-parser/pest/blob/master/meta/src/grammar.pest) (bootstrapped)
* [AshPaper](https://github.com/shnewto/ashpaper)
* [brain](https://github.com/brain-lang/brain)
* [cicada](https://github.com/mitnk/cicada)
* [comrak](https://github.com/kivikakk/comrak)
* [elastic-rs](https://github.com/cch123/elastic-rs)
* [graphql-parser](https://github.com/Keats/graphql-parser)
* [handlebars-rust](https://github.com/sunng87/handlebars-rust)
* [hexdino](https://github.com/Luz/hexdino)
* [Huia](https://gitlab.com/jimsy/huia/)
* [insta](https://github.com/mitsuhiko/insta)
* [jql](https://github.com/yamafaktory/jql)
* [json5-rs](https://github.com/callum-oakley/json5-rs)
* [mt940](https://github.com/svenstaro/mt940-rs)
* [Myoxine](https://github.com/d3bate/myoxine)
* [py_literal](https://github.com/jturner314/py_literal)
* [rouler](https://github.com/jarcane/rouler)
* [RuSh](https://github.com/lwandrebeck/RuSh)
* [rs_pbrt](https://github.com/wahn/rs_pbrt)
* [stache](https://github.com/dgraham/stache)
* [tera](https://github.com/Keats/tera)
* [ui_gen](https://github.com/emoon/ui_gen)
* [ukhasnet-parser](https://github.com/adamgreig/ukhasnet-parser)
* [ZoKrates](https://github.com/ZoKrates/ZoKrates)
* [Vector](https://github.com/timberio/vector)
* [AutoCorrect](https://github.com/huacnlee/autocorrect)
* [yaml-peg](https://github.com/aofdev/yaml-peg)
* [qubit](https://github.com/abhimanyu003/qubit)
* [caith](https://github.com/Geobert/caith) (a dice roller crate)
* [Melody](https://github.com/yoav-lavi/melody)
* [json5-nodes](https://github.com/jlyonsmith/json5-nodes)
* [prisma](https://github.com/prisma/prisma)

## Minimum Supported Rust Version (MSRV)

This library should always compile with default features on **Rust 1.61.0**.

## no_std support

The `pest` and `pest_derive` crates can be built without the Rust standard
library and target embedded environments. To do so, you need to disable
their default features. In your `Cargo.toml`, you can specify it as follows:

```toml
[dependencies]
# ...
pest = { version = "2", default-features = false }
pest_derive = { version = "2", default-features = false }
```

If you want to build these crates in the pest repository's workspace, you can
pass the `--no-default-features` flag to `cargo` and specify these crates using
the `--package` (`-p`) flag. For example:

```bash
$ cargo build --target thumbv7em-none-eabihf --no-default-features -p pest
$ cargo bootstrap
$ cargo build --target thumbv7em-none-eabihf --no-default-features -p pest_derive
```

## Special thanks

A special round of applause goes to Prof. Marius Minea for his guidance and to all
pest contributors, some of whom are none other than my friends.