-
-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add witness template to lints #893
Conversation
Let's keep While generating witness code, we may need to run an additional query to collect contextual information that is only applicable to the witness. For example, say we claim that a function definition has gained an extra argument. The witness generator needs to know:
A valid witness may look like this: fn witness<T: Debug>(arg_1: i64, arg_2: std::sync::Arc<T>) {
let _ = broken_function(arg_1, arg_2);
} or it may look like this: fn witness() {
broken_function();
} and we need the machinery to distinguish between the two.
To start, yes, I think this should be a separate, explicitly permanently-unstable opt-in flag. We can tackle stabilizing it in the future, once we're happy with it.
Great question. I'm not sure. We'll have to come up with a few different UI layouts, then see which works best with our users.
Ideally, we should aim to generate valid Rust that we can compile and prove gets affected by the breakage we identified (e.g. it produces a compile error, or triggers new UB, or some other criterion we can readily point to). Like I mentioned earlier, the witness and the "hint" we show in the UI don't have to be the same. Once we're generating full witnesses, we should have a way to dump them into files/temp projects individually, and have rustc validate the expected breakage does indeed occur. This would allow us to replicate the SemVer study from last year, at hopefully a fraction of the human labor cost. |
src/lints/function_missing.ron
Outdated
@@ -46,4 +46,8 @@ SemverQuery( | |||
}, | |||
error_message: "A publicly-visible function cannot be imported by its prior path. A `pub use` may have been removed, or the function itself may have been renamed or removed entirely.", | |||
per_result_error_template: Some("function {{join \"::\" path}}, previously in file {{span_filename}}:{{span_begin_line}}"), | |||
witness_template: Some( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a fine starting point, but I wouldn't call this a witness by itself since it won't compile. It's more of a hint toward one. So maybe hint_template
or something similar would be a more appropriate name?
I think eventually we'll want to expand upon this to make it easier to generate full witnesses that can generate real working code. While we don't have to implement all that today, it would be lovely to have a sense of the broader design within which today's changes should fit.
For example, I'd recommend adding an entire struct here, with the hint being just one of its fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a fine starting point, but I wouldn't call this a witness by itself since it won't compile. It's more of a hint toward one. So maybe hint_template or something similar would be a more appropriate name?
It's not a witness because it's not an item, so it just needs to be wrapped in a function in this case? What other properties would a witness need to have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also elides the arguments to the function by making the function call will_be_removed_fn(...);
The tension here is this:
- The full witness is needed since it's the ultimate correctness check.
- But the full witness is usually too onerous and heavyweight to show to users inline with regular CLI output. A rustdoc-style (not wrapped in a function, etc.) hint for how the witness works would be preferable in the CLI.
- The CLI output could also provide text to the effect of "we generated the full witness in this temp file, you can go check it out if you want to verify it's valid."
Zooming out — I'm very excited to be moving toward making it possible to produce witness code here! A lot of the rules we check are obscure and complex, and it's not unreasonable to assume that for some of our users, our lints would be the first time they encounter some of those concepts. Saying "your trait isn't object safe" is much more opaque and less user-friendly than saying "look, So I love this, and I'm excited to see it move forward! |
Not at computer right now, but something like: #[derive(Default)]
struct Witness {
hint_template: Option<String>,
witness_template: Option<String>,
witness_query: Option<Query>
}
struct Query {
query: String,
values: BTreeMap<String, String>
} (names notwithstanding and not sure of the value type for query.values). where if the query is Some, we run it on the crate diff and provide the values to the witness_template. |
I think that's a great place to start. I'd rename The one other thing we should consider is that right now, there are cases where we won't be able to generate a witness. For example, for calls to functions whose args changed, we can't currently generate a witness if the function had any arguments (we can't look up their types), or was generic. It would be good to figure out how the "we can't generate a witness in this specific case" gets detected, acted upon, and propagated. |
Updated struct and documented the members. I used Design questions that came up, particularly about the additional query:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call on using TransparentValue
instead!
Should hint_template also be provided the additional query? (I think it would be good for efficiency if we could write hints such that they don't need additional parameters, but I don't know if that's possible).
I'd argue "no" here. If the hint needs something else output, the main query should output it.
Should the additional query itself be provided the results of the original SemverQuery? I would say yes, because otherwise we would have to run each query twice when there is a witness query, but this may complicate things e.g., if the witness query does not use all the @output directives (I know this is an error if the SemverQueries have unused arguments, but I don't know where this validation occurs)
Probably yes. But it might be good to be explicit about which results from the original query it wants.
Perhaps with some serde magic we could specify the query arguments like:
{
"some_var": "custom value",
"inherited": Inherit("output_from_other_query"),
}
What do you think about this approach, assuming it's doable? If you think it's good, let's see if it can be done in reasonable fashion.
How should @outputs be handled when there is one of the same name in the original query and in the additional query. It could be an error, or the additional query could take precedence (I wrote the latter in the docs, but we could also do the other one).
Probably safer to make it error, I think. I don't see a lot of value in allowing shadowing, whereas it could result in some unfortunate debugging if it happens by accident.
When should we generate witnesses? Should c-s-c show hints by default but opt-in to generate (and test compilation of?) full witness programs? This is my initial choice, but it would depend on whether we run additional queries for hints, because we don't want to slow down c-s-c by default.
Showing hints by default and generating/testing witnesses when explicitly asked sounds right to me.
src/query.rs
Outdated
/// more information. Witness implements `Default` for a query that doesn't generate any | ||
/// witness code. | ||
#[serde(default)] | ||
pub witness: Witness, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't we keep serde(default)
and make this Option<Witness>
. Otherwise we might end up in a weird situation where we have no hint text (so nothing to show the user) even though we can generate a full witness example. We'd like weird states like that to be unrepresentable.
I'd make this Option<Witness>
then make the hint required but the other fields inside still be Option
.
/// Notice how this is not a compilable example, but it provides a distilled hint to the user | ||
/// of how downstream code would break with this change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your doc comments game continues to be excellent 🚀
src/check_release.rs
Outdated
|
||
config.log_info(|config| { | ||
config.shell_note("the following downstream code would break:")?; | ||
writeln!(config.stderr(), "{message}")?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit worried about readability here. It's too early to fuss over spacing and the exact CLI output format, but just wanted to flag this for us to revisit as a polish item before shipping this feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In terms of spacing between items, or the messages themselves (or both?) I did get syntax highlighting (on the other branch) working for code snippets so that could help for the snippets themselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spacing, mostly, both between the note and the code, and possible indentation of the hint code. We want to do our best to be readable but concise, and match cargo/rustc style to make merging into cargo easier.
It'll be easier to revisit this later once we have some more witnesses that we're showing. For now, let's leave them hidden behind some permanently-unstable CLI flag (e.g. -Zunstable-options --show-witness-hints
) so we can iterate freely?
Perhaps adding the infrastructure for such unstable flags would be a good PR in and of itself?
Added I'm renaming the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this a lot, well done! 🚀 I think it's safe to start working on a few more witnesses for different lints as stacked PRs, since I don't think our lint file's format will change much from here.
I'm planning to cut a new release next week, and it would be a bit safer if we hid this feature behind a permanently-unstable flag similar to how cargo/rustdoc/rustc etc. hide work-in-progress features. Would you be open to setting up a similar mechanism for us? We'd probably want to mirror their approaches quite closely, down to whether the permanently-unstable flags show up in --help
output or not.
src/check_release.rs
Outdated
|
||
config.log_info(|config| { | ||
config.shell_note("the following downstream code would break:")?; | ||
writeln!(config.stderr(), "{message}")?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spacing, mostly, both between the note and the code, and possible indentation of the hint code. We want to do our best to be readable but concise, and match cargo/rustc style to make merging into cargo easier.
It'll be easier to revisit this later once we have some more witnesses that we're showing. For now, let's leave them hidden behind some permanently-unstable CLI flag (e.g. -Zunstable-options --show-witness-hints
) so we can iterate freely?
Perhaps adding the infrastructure for such unstable flags would be a good PR in and of itself?
src/lints/function_missing.ron
Outdated
@@ -46,10 +46,9 @@ SemverQuery( | |||
}, | |||
error_message: "A publicly-visible function cannot be imported by its prior path. A `pub use` may have been removed, or the function itself may have been renamed or removed entirely.", | |||
per_result_error_template: Some("function {{join \"::\" path}}, previously in file {{span_filename}}:{{span_begin_line}}"), | |||
witness: ( | |||
hint_template: Some( | |||
witness: Some(( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you try serde(flatten)
in addition to serde(default)
here? It might make the Some
unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to add it and it broke deserialization of the outer SemverQuery
field 🫤
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible to use ron's implicit_some
option to be able to specify it without the Some
, but this would allow (but not require) this behavior on all Option<T>
in the SemverQuery
. branch link Is that something we would want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh neat, TIL about that extension. I think that's fine, yeah. Maybe good for a separate PR we can merge immediately?
src/query.rs
Outdated
@@ -257,7 +254,7 @@ pub struct Witness { | |||
/// Notice how this is not a compilable example, but it provides a distilled hint to the user | |||
/// of how downstream code would break with this change. | |||
#[serde(default)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might not need #[serde(default)]
here since this should always be specified (given the outer option wasn't None
) I think, right?
src/query.rs
Outdated
/// Represents either a value inherited from a previous query, or a | ||
/// provided constant value. | ||
#[derive(Debug, Clone, Serialize, Deserialize)] | ||
pub enum InheritedValue { | ||
/// Inherit the value from the previous output whose name is the given `String`. | ||
Inherited(String), | ||
/// Provide the constant value specified here. | ||
#[serde(untagged)] | ||
Constant(TransparentValue), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, this is exactly what I was hoping for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never mind 😅 it seems like ron
has known limitations with untagged
/deserialize_any
and having nested untagged
makes it so the deserialization doesn't work 😔. Maybe we have a list (or map of new_name -> old_name) of inherit_arguments
along with a map of arguments
? Then it's an error if the maps share keys. Alternately, we could remove the untagged
so users would have to express arguments as Inherit("...")
or Constant(...)
, which prevents duplicate keys from being expressed but is more verbose and less consistent with the arguments
top-level field.
round trip
println!(
"{:?},
ron::from_str::<InheritedValue>(
&ron::to_string(&InheritedValue::Inherited("abc".into())).unwrap()
),
);
Ok(Constant(List([String("abc")])))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof, that's unfortunate 😞 What do you think is the better choice between the list/renaming map and the explicit Inherit(..)
vs Constant(..)
?
If you feel it's worth writing a few more witness sections before deciding, that's fine by me. We can go the easier route now and revisit based on your experience of using it in the real world.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would prefer the explicit one. I also thought of another idea that might work. It's more jsony than rusty but it does remove the requirement to specify Constant(...)
:
arguments: {
"abc": (inherit: "abc"),
"string": "literal_string",
"int": -30,
}
where we specify inheriting as a one-field struct (inherit: "name")
or verbosely, Inherited(inherit: "name")
. How do you feel about this. It won't have any conflicts (I think) with TransparentValue
because maps and structs in ron
have different syntax, so Inherit
and Constant
variants of the enum should be disjoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a test that shows that the deserialization is what we expect.
That seems reasonable! Can you just check that lists work too?
…On Sat, Aug 31, 2024, 2:16 PM Max Carr ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/query.rs
<#893 (comment)>
:
> +/// Represents either a value inherited from a previous query, or a
+/// provided constant value.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub enum InheritedValue {
+ /// Inherit the value from the previous output whose name is the given `String`.
+ Inherited(String),
+ /// Provide the constant value specified here.
+ #[serde(untagged)]
+ Constant(TransparentValue),
}
I pushed a test that shows that the deserialization is what we expect.
—
Reply to this email directly, view it on GitHub
<#893 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAR5MSXXDGJMZI3V3V5FWHDZUICATAVCNFSM6AAAAABNK6B6Z2VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDENZUGEYTCMRWHE>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
Added list to test and it works! 👍 |
src/query.rs
Outdated
@@ -800,6 +799,7 @@ mod tests { | |||
"abc": (inherit: "abc"), | |||
"string": "literal_string", | |||
"int": -30, | |||
"list": [-30, -2, "abc"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heterogeneous lists like this aren't something Trustfall supports. This test case might make future refactoring harder if e.g. the TransparentValue
type changes in a way that disallows that at the type level — the person doing it will have to also fix this test case too. It's usually not a good idea to exceed the bounds of the spec, because if the implementation of the spec starts enforcing the bounds more strongly then breakage ensues.
I'd recommend removing the string here, and making a separate list entry with a string or two to test strings inside the list.
All looks good other than the tiny nitpick on the test structure! Btw, is the failing test accurate in saying that we print the error template to stdout and the hint to stderr? That seems strange and probably not desirable. Might be time to look into getting our use of stdout/stderr to be more consistent, or at least maybe introducing the ability to print notes to either stdout/stderr as needed in the caller's context. |
c0d11b5
to
890dd41
Compare
Rebased on #897 and fixed test. |
I think any of the |
I think this is the first time we've used I think it's confusing to direct the witness to stderr while the lint output is on stdout. I think the witness info should be on stdout too, so the user doesn't have to figure out how to interleave the stdout/stderr outputs together to figure out what each witness was referring to.
Probably a good idea to test with the witnesses present, even though it's an unstable feature. We still want to know if we broke something, or else we'll have a really hard time stabilizing it. |
- update behavior for stable flags and test - move bugreport action to the top - add test that all unstable features are hidden
Co-authored-by: Predrag Gruevski <[email protected]>
8824224
to
9c01217
Compare
Added configuration to indicate in a Todos:
|
Added test harness and moved note to stdout, so this should be ready for review. |
( | ||
filename: "src/lib.rs", | ||
begin_line: 4, | ||
hint: "use feature_flags_validation::foo_becomes_gated;\nfoo_becomes_gated(...);", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's mentioned in the comment in the test code, but prettier (multiline) strings are blocked on the release of ron v0.9 (which although there is an alpha, doesn't seem to be happening anytime soon due to maintainer overburden). it may be worth (albeit hacky), doing a str.replace("\\n", "\n")
after serializing to a string for better diffs.
filename: filename.to_string(), | ||
begin_line, | ||
hint: registry | ||
.render_template(&witness.hint_template, values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
once we have the infrastructure to run witness queries and render full witnesses, code here should call an encapsulated version of that logic to generate the full witness here, if it is Some
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a TODO note to our future selves about it.
@@ -627,6 +627,10 @@ impl From<CheckRelease> for cargo_semver_checks::Check { | |||
check.set_build_target(build_target); | |||
} | |||
|
|||
let mut witness_generation = WitnessGeneration::new(); | |||
witness_generation.show_hints = value.unstable_options.witness_hints; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could also add setters for these instead of pub fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have strong feelings about it, this is fine for now.
.bold(); | ||
writeln!( | ||
config.stdout(), | ||
"{note}note:{note:#} downstream code similar to the following would break:\n\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here i just inlined the note code. we can't have a method in GlobalConfig
that takes an impl Write
and renders a note/error/warning/etc., because .stdout
/.stderr
mutably borrows the whole GlobalConfig
struct. we could:
- make (associated) functions
- implement them on the streams themselves (using a newtype)
- revisit using a crate like
annotate-snippets
to render for us - make macros
if we want a longer-term solution, if you think it is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revisiting with something like annotate-snippets
down the line would be great. This is a reasonable solution for the meantime — it doesn't make sense to hold up witness generation over CLI formatting.
@obi1kenobi btw this one is ready for review when you get a chance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Making some trivial tweaks and merging. Also leaving an action item for you for a subsequent PR on updating contributing docs to explain the witness section.
.bold(); | ||
writeln!( | ||
config.stdout(), | ||
"{note}note:{note:#} downstream code similar to the following would break:\n\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revisiting with something like annotate-snippets
down the line would be great. This is a reasonable solution for the meantime — it doesn't make sense to hold up witness generation over CLI formatting.
@@ -627,6 +627,10 @@ impl From<CheckRelease> for cargo_semver_checks::Check { | |||
check.set_build_target(build_target); | |||
} | |||
|
|||
let mut witness_generation = WitnessGeneration::new(); | |||
witness_generation.show_hints = value.unstable_options.witness_hints; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have strong feelings about it, this is fine for now.
.with_context(|| format!("Query {query_name} has a witness template, but could not load\n\ | ||
`test_outputs/witnesses/{query_name}.output.ron` expected witness file, did you forget to add it?")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a separate PR, could you amend our contributing docs for adding new lints to add an optional step for "add a witness" and add a small section explaining how to do it?
I'd love to see a before-and-after screenshot of the difference if you get a chance! I'm open to adding a dependency if it results in notable improvement in user experience, and syntax highlighting certainly could be it. |
Ah unfortunately this has a merge conflict due to the priority logic for lints that merged yesterday. Sorry about that! |
This adds a new optional field
witness_template
to each lint that can be rendered to make a minimal downstream witness package to show how this change could break downstream code.Design questions:
--witness
flag or something similar?Todo list for me:
#[serde(default)]
make_new_lint.sh
- should this beSome("TODO")
orNone
- not every lint may have a possible witnessbat
likecargo-expand
- branch link. Do we want this? It adds some dependencies.