Giter Club home page Giter Club logo

verusfmt's Introduction

Verusfmt

An opinionated formatter for Verus code.

WARNING

verusfmt is highly experimental code. Make backups of your files before trying verusfmt on them.

Installing and Using Verusfmt

We support multiple install methods:

  • Latest release using:

    • Pre-built binary on Linux/MacOS (click to expand)
      curl --proto '=https' --tlsv1.2 -LsSf https://github.com/verus-lang/verusfmt/releases/latest/download/verusfmt-installer.sh | sh
    • Pre-built binary on Windows (click to expand)
      irm https://github.com/verus-lang/verusfmt/releases/latest/download/verusfmt-installer.ps1 | iex
    • cargo install (click to expand)
      cargo install verusfmt --locked
  • Bleeding-edge latest git commit

    • cargo install (click to expand)
      cargo install --git https://github.com/verus-lang/verusfmt --locked

These will install the verusfmt binary. You can then run it on a file using:

verusfmt foo.rs

See verusfmt --help for more options and details.

Goals

  1. Make it easier to read and contribute to Verus code by automatically formatting it in a consistent style (added bonus: eliminating soul-crushing arguments about style).
  2. Produce acceptably "pretty" output.
  3. Run fast! verusfmt may be run in pre-submit scripts, CI, etc., so it can't be slow.
  4. Keep the code reasonably simple. Pretty printers are notoriously hard, so we try to take steps to reduce that difficulty, so that verusfmt can be updated and adapted with a reasonable amount of effort.

FAQ

  1. Why not adapt rustfmt for Verus idioms?

    While Verus has Rust-like syntax, it necessarily also deviates from it to support its idioms naturally, and thus not only would the parser for rustfmt need updates, but also careful changes to the emitter would be needed to have code look natural. Additionally, since practically all Verus code is inside the verus!{} macro (and rustfmt does not easily support formatting even regular Rust inside macros), a non-trivial amount of effort would be required to perform the plumbing and maintenance required to support both formatting outside the verus!{} macro (as Rust code), while also formatting Verus code inside the macro.

  2. Does verusfmt match rustfmt on code outside the verus!{} macro?

    Yes, by default, verusfmt handles code inside the verus!{} macro, and also runs rustfmt to handle code outside the macro. Neither should clash with the other or override each other's formatting changes. Thus, this makes it easier to incrementally verify small amounts of code inside a larger unverified Rust crate. You can disable the invocation of rustfmt using --verus-only.

  3. Why not build this as a feature of Verus?

    By the time Verus receives an AST from rustc, we've already lost information about whitespace and comments, meaning that we couldn't preserve the comments in the reformatted code. Plus, firing up all of rustc just to format some code seems heavyweight.

Future Work

  • Special handling for commonly used macros, like println!, state_machine!, calc!
  • Enforce the Rust naming policy?

Non-Future Work

  • We currently have no plans to sort use declarations the way rustfmt does
  • We do not intend to parse code that Verus itself cannot parse. Sometimes verusfmt happens to parse such code, but that is unintentional and cannot be counted upon.
  • Perfectly match the formatting produced by rustfmt
  • Handle comments perfectly -- they're surprisingly hard!

Design Overview

Our design is heavily influenced by the Goals above. Rather than write everything from scratch (a notoriously hard undertaking), we use a parser-generator to read in Verus source code, and a pretty-printing library to format it on the way out. We try to keep each phase as performant as possible, and we largely try to keep the formatter stateless, for performance reasons but more importantly to try to keep the code reasonably simple and easy to reason about. Hence we sometimes deviate from Rust's style guidelines for the sake of simplicity.

Parsing

We define the syntax of Verus source code using this grammar, which is processed by the Pest parser generator, which relies on Parsing Expression Grammars (PEGs). It conveniently allows us to define our notion of whitespace and comments, which the remaining rules can then ignore; Pest will handle them implicitly. We explicitly ignore the code outside the verus! macro, leaving it to rustfmt. We prefer using explicit rules for string constants, since it allows a more uniform style when formatting the code. In some places, we have multiple definitions for the same Verus construct, so that we can format it differently depending on the context (see, e.g., attr_core). Many of the rules are designed to follow the corresponding description in The Rust Reference.

Formatting

Rather than try to format things ourselves, we rely on the pretty crate, based on Philip Wadler's design for a pretty printer. The core idea is that you create a set of possible code layouts, and the pretty printer then uses its internal heuristics to pick the prettiest version. Typically this means that we specify where, say, line breaks can occur if the code needs to be placed on multiple lines, but you can also use the group operator to say that for a particular code snippet, the pretty printer should also consider placing everying in the group on a single line.

As much as possible, we try to keep the formatter simple by arranging for the formatting of a node to be computed by simply formatting each of its children. Sometimes this requires splitting a node in the parser, so that we can format the same item in two different ways, depending on its context. Rust contexts can be tricky to track dynamically (since Rust allows expressions in statements, and statements in expression), so we try to keep the formatter stateless to reduce the scope for errors.

Contributing

We welcome contributions! Please read on for details.

We consider it a bug in verusfmt if you provide verusfmt with code that Verus accepts and verusfmt produces code that Verus does not accept or code that has different semantics from the original. When this happens, please open a GitHub issue with a minimal example of the offending code before and after formatting.

If verusfmt produces valid code but you dislike the formatting, please open a GitHub pull request with your proposed changes and rationale for those changes. Please ensure that existing test cases still pass (see below for more details), unless your goal is to change how some of those test cases are handled. Please also include new/updated tests that exercise your proposed changes.

Tips for Developing a Fix or an Improvement

  1. Write out a small example .rs file focusing on the specific thing you want to improve. Test verusfmt on your reduced example to make sure the issue still manifests; running with --check is very helpful with this process. The smaller your example file, the easier subsequent steps will be.
  2. When running with --check, you can also add -dd to see 2nd level debug output, which includes the parse tree. Look through the parse tree and identify rules relevant to your particular example.
  3. Once you find the relevant rule names, if the issue seems to be a misparse, jump over to src/verus.pest to find the relevant rule(s) and see if the grammar needs to be fixed or improved.
  4. If the parsing is fine but the printing is an issue, then look for the relevant rule(s) in the to_doc function in src/lib.rs. This might be a bit difficult to understand immediately, so having the pretty docs handy is quite helpful. Also, it is helpful to look at the relevant debug print (using -d or -dd), which gives a serialized version of the recursively expanded doc, right before it has been optimized, so figuring out which particular bit of it is not behaving as you like is quite helpful.
  5. Attempt fixes until the small example succeeds.
  6. Add the example into the tests---see the Testing section below.
  7. If fixing the rule for your small example succeeds but breaks other tests, you may need to split the relevant rule in the parsing grammar into two separate cases, so that each case can be formatted independently. See comma_delimited_exprs_for_verus_clauses and groupable_comma_delimited_exprs_for_verus_clauses, for example.

Testing

Rust-like formatting

In general, we try to adhere to Rust's style guide. Tests for such adherence live in tests/rustfmt-matching.rs. These tests will compare the output of rustfmt to that of verusfmt. You can run them via cargo test.

Verus-like formatting

In various places, we deviate from Rust's style, either to simplify the formatter or to handle Verus-specific syntax. Tests for formatting such code live in tests/verus-consistency.rs. You can add a new test or modify an existing one by writing/changing the input code. The test's correct answer is maintained via the Insta testing framework.

Insta recommends installing the cargo-insta tool for an improved review experience:

cargo install cargo-insta

You can run the tests normally with cargo test, but it's often more convenient to run the tests and review the results via:

cargo insta test
cargo insta review

or more succinctly:

cargo insta test --review

verusfmt's People

Contributors

jaybosamiya avatar parno avatar pratapsingh1729 avatar tjhance avatar utaal avatar utaal-b avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

verusfmt's Issues

`impl TRAIT for TYPE`

On branch pretty.

Before:

impl OwlSpecSerialize for owlSpec_t {
open spec fn as_seq(self) -> Seq<u8> { serialize_owlSpec_t(self) }
}

After:

impl OwlSpecSerializefor owlSpec_t {
    open spec fn as_seq(self) -> Seq<u8> {
        serialize_owlSpec_t(self)
    }
}

Note the space before the for keyword is gone.

Preservation of stanzas

It's a natural pattern to group lines into logical groupings. However, verusfmt collapses empty lines.

> cat verusfmt.rs
use vstd::prelude::*;

verus! {

fn f() {
    // First stanza.
    let a = 1;
    let b = a;

    // Second stanza.
    let c = a;
    let d = c;
}

} // verus!
> verusfmt --check verusfmt.rs
   0.054808000s ERROR Input found not to be well formatted
--- verusfmt.rs
+++ verusfmt.rs.formatted
@@ -6,7 +6,6 @@
     // First stanza.
     let a = 1;
     let b = a;
-
     // Second stanza.
     let c = a;
     let d = c;

Error:   × invalid formatting

Formatting of multi-line `seq!` macro

verusfmt 0.2.8 formats:

use vstd::prelude::*;

verus! {

proof fn f() {
    let s =
        seq![
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    ];
}

} // verus!

This occurs with two lines or more. With only one you get:

proof fn f() {
    let s = seq![
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    ];
}

Unnecessary newline in same-file `mod`ules

Consider

verus! { mod foo { fn foo() { } } }

It currently gets formatted to:

verus! {

mod foo {
    fn foo() {
    }

}

} // verus!

It would be preferable if it were formatted to

verus! {

mod foo {
    fn foo() {
    }
}

} // verus!

Wildcard type annotations on `let`-bindings are removed

Before:

use vstd::prelude::*;

verus! {

fn foo() {
    let x: _ = bar();
}

} // verus!

verusfmt output:

use vstd::prelude::*;

verus! {

fn foo() {
    let x:  = bar();
}

} // verus!

The wildcard type annotation on x is gone.

(This is minimized from a larger example in my codegen where the type annotation is on a tuple, but I want to annotate only one element of the tuple.)

Bug in inline comment handling after an attribute

Before:

#[verifier(external_body)] // comment breaks it! 
pub exec fn foo(a: usize) -> (b: usize)
    requires a > 5
    ensures b > 5
{
    a + 2
}

After:

#[verifier(external_body)]  // comment breaks it!pub exec fn foo(a: usize) -> (b: usize)
    requires
        a > 5,
    ensures
        b > 5,
{
    a + 2
}

The newline after the inline comment is gone.

Bug with macro calls

verus! {

pub exec fn foo() {
    let temp_owl__x23 = { let x: Vec<u8> = mk_vec_u8!(); x };
}

} // verus!

This breaks with the following error:

Error:  --> 4:58
  |
4 |     let temp_owl__x23 = { let x: Vec<u8> = mk_vec_u8!(); x };
  |                                                          ^---
  |
  = expected COMMENT, at_str, dot_str, dot_dot_str, dot_dot_eq_str, lbracket_str, question_str, semi_str, as_str, else_str, has_str, is_str, bin_expr_ops, or arg_list

It works fine if I make the mk_vec_u8!() into a normal function call, like mk_vec_u8(). Seems like something with the parser/grammar.

Non-idempotency for comment being moved to same line

Consider the following code:

verus! {
impl a {
    fn b()
    //
    ;
}
}

Formatting this is non-idempotent (it does settle to a steady state in 2 iterations though; the first time around, it just doesn't place enough spaces for some reason)

$ cargo run --release -- test.rs
    Finished release [optimized] target(s) in 0.08s
     Running `target/release/verusfmt test.rs`

$ cargo run --release -- --check test.rs
    Finished release [optimized] target(s) in 0.08s
     Running `target/release/verusfmt --check test.rs`
   0.045484917s ERROR Input found not to be well formatted
--- original
+++ formatted
@@ -1,7 +1,7 @@
 verus! {

 impl a {
-    fn b()//
+    fn b()  //
     ;
 }


Error: invalid formatting

It appears that the rules parsed/processed themselves seem to be the same in the --debug output:

$ # Remember to reset `test.rs` to the file at the start of this issue
$ cargo run --release -- --debug test.rs
    Finished release [optimized] target(s) in 0.07s
     Running `target/release/verusfmt --debug test.rs`
   0.044931166s  INFO Processing rule verus_macro_body
   0.044958041s  INFO Processing rule item
   0.044959375s  INFO Processing rule impl
   0.044960458s  INFO Processing rule impl_str
   0.044961666s  INFO Processing rule type
   0.044962541s  INFO Processing rule path_type
   0.044963333s  INFO Processing rule path
   0.044964083s  INFO Processing rule path_segment
   0.044964958s  INFO Processing rule name_ref
   0.044966000s  INFO Processing rule assoc_item_list
   0.044966833s  INFO Processing rule assoc_items
   0.044967791s  INFO Processing rule assoc_item
   0.044968583s  INFO Processing rule fn
   0.044969833s  INFO Processing rule fn_str
   0.044970750s  INFO Processing rule name
   0.044971583s  INFO Processing rule param_list
   0.045028083s  INFO Processing rule COMMENT
   0.045031583s  INFO Processing rule fn_qualifier
   0.045032875s  INFO Processing rule fn_terminator
   0.045033791s  INFO Processing rule semi_str

$ cargo run --release -- --debug --check test.rs
    Finished release [optimized] target(s) in 0.07s
     Running `target/release/verusfmt --debug --check test.rs`
   0.044528708s  INFO Processing rule verus_macro_body
   0.044565125s  INFO Processing rule item
   0.044566583s  INFO Processing rule impl
   0.044567833s  INFO Processing rule impl_str
   0.044569000s  INFO Processing rule type
   0.044569875s  INFO Processing rule path_type
   0.044570666s  INFO Processing rule path
   0.044571500s  INFO Processing rule path_segment
   0.044572458s  INFO Processing rule name_ref
   0.044573458s  INFO Processing rule assoc_item_list
   0.044574291s  INFO Processing rule assoc_items
   0.044575291s  INFO Processing rule assoc_item
   0.044576041s  INFO Processing rule fn
   0.044577333s  INFO Processing rule fn_str
   0.044578291s  INFO Processing rule name
   0.044579083s  INFO Processing rule param_list
   0.044644166s  INFO Processing rule COMMENT
   0.044647583s  INFO Processing rule fn_qualifier
   0.044648583s  INFO Processing rule fn_terminator
   0.044649375s  INFO Processing rule semi_str
   0.044755708s  INFO Found some differences in test.rs
   0.044770125s ERROR Input found not to be well formatted
--- original
+++ formatted
@@ -1,7 +1,7 @@
 verus! {

 impl a {
-    fn b()//
+    fn b()  //
     ;
 }


Error: invalid formatting

Report filename when formatting multiple files

verusfmt accepts multiple files on the command line. However, when one of them contains an error, we only report the line number, without reporting the filename, leaving it ambiguous where the error lies.

Non-idempotency of macro items in presence of in-file `mod`ules

verusfmt seems to want to keep moving macro items further to the right on each run

mod foo {
    verus! {

    bar! {
        baz
    }

    } // verus!
}
$ cargo run -- temp.rs
    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/verusfmt temp.rs`

$ cargo run -- --check temp.rs
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/verusfmt --check temp.rs`
   0.062335083s ERROR Input found not to be well formatted
--- temp.rs
+++ temp.rs.formatted
@@ -2,8 +2,8 @@
     verus! {

     bar! {
-            baz
-        }
+                baz
+            }

     } // verus!
 }

Error:   × invalid formatting

$ cargo run -- temp.rs
    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/verusfmt temp.rs`

$ cargo run -- --check temp.rs
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/verusfmt --check temp.rs`
   0.075480083s ERROR Input found not to be well formatted
--- temp.rs
+++ temp.rs.formatted
@@ -2,8 +2,8 @@
     verus! {

     bar! {
-                baz
-            }
+                    baz
+                }

     } // verus!
 }

Error:   × invalid formatting

Incorrect operator precedence for `&&&` and `forall`

We seem to have a misparse happening due to precedence of special bulleted ops and quantifiers. Minimized example is:

&&& forall|| f
&&& g

which parses as

- expr > expr_inner > prefix_expr
  - triple_and: "&&&"
  - expr > expr_inner > quantifier_expr
    - forall_str: "forall"
    - closure_param_list: "||"
    - expr
      - expr_inner > path_expr_no_generics > path_no_generics > path_segment_no_generics > name_ref > identifier: "f"
      - bin_expr_ops > bin_expr_ops_special > triple_and: "&&&"
      - expr > expr_inner > path_expr_no_generics > path_no_generics > path_segment_no_generics > name_ref > identifier: "g"

Instead, the g should not be considered in the body of the forall.


Original report by @jialin-li:

verusfmt's interaction with bulleted & and forall looks a little misleading, this is the result post formatting

        &&& forall|au|
            #![auto]
            deallocs.contains(au) ==> self.mini_allocator.can_remove(au)
            &&& post == EphemeralState {
                mini_allocator: self.mini_allocator.prune(deallocs),
                ..self
            }
            &&& new_dv == dv

this is also true for exists

||| exists|cut, addr|
            self.is_marshalled_state(dv, allocs, deallocs, cut, addr, post, new_dv)
            ||| self.is_allocator_fill(dv, allocs, deallocs, post, new_dv)
            ||| self.is_allocator_prune(dv, allocs, deallocs, post, new_dv)
            ||| self.is_no_op(dv, allocs, deallocs, post, new_dv)

Handling inline comments outside the main verus-body

The main parse_and_format has special handling for top-level comments and items, but this means those items don't benefit from/share code with the main formatting logic in to_doc. For example, when we process this code:

verus! {

pub type ReplicaId = usize; // $line_count$Trusted$

} // verus!

It gets parsed as a top-level type_alias item followed by a comment that gets treated as a top-level suffix comment in parse_and_format, which means we don't notice that it's been flagged as an inline comment. We could replicate that logic in parse_and_format, but I suspect we should instead try to simplify parse_and_format and rely more consistently on to_doc.

Non-idempotent `verus!` inside `macro_rules!`

macro_rules! a {
    () => {
        verus! {
        }
    }
}
$ cargo run -- -Z idempotency-test foo.rs
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/verusfmt -Z idempotency-test foo.rs`
   0.115861292s ERROR 😱Formatting found to not be idempotent😱
--- foo.rs.formatted-once
+++ foo.rs.formatted-twice
@@ -1,6 +1,6 @@
 macro_rules! a {
     () => {
         verus! {
-                }
+                        }
     };
 }

Parser fails on lifetimes in the parameter list and return type

This example:

fn get(&'a self) -> (o: &'a V) {
    0
}

fails with:

1 | fn get(&'a self) -> (o: &'a V) {
  |                           ^---
  |
  = expected COMMENT, colon_str, dot_dot_eq_str, dot_dot_str, or pipe_str

but the following work:

fn get(&'a self) -> (o: V) {
    0
}
fn get(a:int) -> (o: &'a V) {
    0
}
fn get(&'a self, b:int) -> (o: &'a V) {
    0
}

And on its own, -> (o: &'a V) is correctly parsed as a ret_type.

This would be easier to debug if we could get a partial parse tree out of Pest when it fails.

Add support for locally opting out of verusfmt

In some (hopefully rare) cases, the developer has a better idea of how a term should be formatted. For example, here:

fn pow2() {
    assert(
        pow2(0) == 0x1 &&
        pow2(1) == 0x2 &&
        pow2(2) == 0x4 &&
        pow2(3) == 0x8 &&
        pow2(4) == 0x10 &&
        pow2(5) == 0x20 &&
        pow2(6) == 0x40 &&
...
        pow2(32) == 0x100000000 &&
        pow2(64) == 0x10000000000000000
   ) by(compute_only);
}

The formatting is intended to make it clear how the exponents are progressing, whereas verusfmt tries to fit as many terms as it can on each line. One way to support such cases is via functionality similar to rustfmt's #[rustfmt::skip] attribute.

`verusfmt` doesn't handle borrows of identifiers that start with the characters `mut`

Input:

verus! {
pub exec fn foo(mut_state: &mut Blah) {
    let a = { b(&mut_state.c) };
}
} // verus!

Error:

Error:  --> 7:27
  |
7 |     let a = { b(&mut_state.c) };
  |                           ^---
  |
  = expected COMMENT, bang_str, colons_str, generic_arg_list, or record_expr_field_list

The error goes away and the code is correctly formatted if I do any of the following:

  • remove the borrow: let a = { b(mut_state.c) };
  • remove the characters mut_: let a = { b(&state.c) };
  • prepend something before mut: let a = { b(&xmut_state.c) };

I would guess this is something to do with the parsing of &mut? rustfmt is able to handle the analogous Rust code:

pub fn foo(mut_state: &mut Blah) {
    let a = { b(&mut_state) };
}

Better handle CRLF normalization

If we format a properly-formatted file that has Windows newlines, then verusfmt replaces them with Unix newlines (at least on a Unix system (possibly most confusingly with WSL?)); however, this may cause confusion on verusfmt --check since literally every line will look visually identical in the diff. It would be nice if we can detect this as the reason, and provide a better error message rather than dump the whole file in the -/+ of the diff.

Core issue hit originally by @jaylorch, likely due to git's handling of newline changes.

Mis-parsing requires clauses as generic arguments

In this program:

verus! {

fn test()
    requires i < 0, len > 0,
{
}    

} // verus!

We parse the requires clause as the path_expr i<0,len> followed by the unexpected token 0, rather than the intended meaning of two binary expressions, namely i < 0 and len > 0. I don't immediately see an easy fix.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.