-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Update/clarify documentation of generic constants #5473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
struct SymbolicConstant : Printable<SymbolicConstant> { | ||
// The constant instruction that defines the value of this symbolic constant. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also use the "defines the value of a constant" terminology in the definition of the InstConstantKind
values, so if you're changing it here, can you change it there too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Note that I made a couple of drive-by fixes while I was there.
// This instruction represents a symbolic or concrete constant whenever its | ||
// operands are constant. Otherwise, it is non-constant. For example, a | ||
// `TupleValue` defines a constant if and only if its operands are constants. | ||
// Constant evaluation support for types with this constant kind is provided | ||
// automatically. | ||
// `TupleValue` represents a constant if and only if its operands represent | ||
// constants. Constant evaluation support for types with this constant kind is | ||
// provided automatically. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the pre-existing description here is wrong. How about this:
// This instruction represents a symbolic or concrete constant whenever its | |
// operands are constant. Otherwise, it is non-constant. For example, a | |
// `TupleValue` defines a constant if and only if its operands are constants. | |
// Constant evaluation support for types with this constant kind is provided | |
// automatically. | |
// `TupleValue` represents a constant if and only if its operands represent | |
// constants. Constant evaluation support for types with this constant kind is | |
// provided automatically. | |
// This instruction can represent a symbolic or concrete constant, and has a | |
// constant value of the same kind whenever its operands are constant. | |
// Otherwise, it is non-constant. For example, a `TupleValue` has a constant | |
// value if and only if its operands have constant values, and its constant | |
// value is always represented by a `TupleValue` instruction. Constant | |
// evaluation support for types with this constant kind is provided | |
// automatically. |
// This instruction always represents a constant value. This is the same as | ||
// `WheneverPossible`, except that the operands are known in advance to always | ||
// be constant. For example, `IntValue`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this one was right before, and the changes are incorrect (probably because the description of WheneverPossible
was wrong).
// This instruction always represents a constant value. This is the same as | |
// `WheneverPossible`, except that the operands are known in advance to always | |
// be constant. For example, `IntValue`. | |
// This instruction always has a constant value of the same kind. This is the | |
// same as `WheneverPossible`, except that the operands are known in advance | |
// to always be constant. For example, `IntValue`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a guess about where the disconnect might be: the original version would be correct, and my new version would be wrong, if we add the restriction that only a canonicalized inst can represent a constant value. Is that the issue? If so, I've intentionally tried to avoid that restriction so that these can be self-contained properties of inst values (or more precisely inst graphs), whereas canonicalization is a non-local property of IDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I see where you're coming from. But I don't think that matches well with the model and terminology that we're using.
Whether particular inst IDs are the canonical instruction that defines a constant value (that is, whether constant_values().GetConstantInstId(inst_id) == inst_id
) is a fundamental part of what we're trying to document here. Moreover, when we talk about instructions, we're generally talking about a thing with identity that lives in a SemIR::File
, not a SemIR::Inst
object (or a typed inst object) -- the latter are probably best thought of more as potential instructions or views of instructions, and the actual instructions are the things in the inst value store. In particular, instructions have IDs and constant values and such, which are not part of the Inst
value.
I think the new version of this comment is also wrong even with your approach of using "represents a constant value" to mean "instruction has the right value that it could be the canonical representation of its constant value" -- just because the operands are constant (have constant values), doesn't mean that they are represented by canonical IDs, which is what we need for the instruction to be in its canonical representation. For example, an ArrayType
whose bound_id
is the expression 1 + 1
does not "represent a constant value" because it's not in the proper form to represent a constant value (it's not a "constant inst" in the terminology you used in constant.h
), but we use Always
for ArrayType
because it always has a constant value of the same kind (in this case, an ArrayType
whose bound_id
is the IntValue
2).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I see where you're coming from. But I don't think that matches well with the model and terminology that we're using.
If so, I can revert to the previous version, but I could use some clarification on a couple of points.
Whether particular inst IDs are the canonical instruction that defines a constant value (that is, whether
constant_values().GetConstantInstId(inst_id) == inst_id
) is a fundamental part of what we're trying to document here.
How so? None of the places where we use this enum seem to care about that property.
Moreover, when we talk about instructions, we're generally talking about a thing with identity that lives in a
SemIR::File
, not aSemIR::Inst
object (or a typed inst object) -- the latter are probably best thought of more as potential instructions or views of instructions, and the actual instructions are the things in the inst value store. In particular, instructions have IDs and constant values and such, which are not part of theInst
value.I think the new version of this comment is also wrong even with your approach of using "represents a constant value" to mean "instruction has the right value that it could be the canonical representation of its constant value" -- just because the operands are constant (have constant values), doesn't mean that they are represented by canonical IDs, which is what we need for the instruction to be in its canonical representation.
For what it's worth, the model I had in mind wasn't "the instruction value could be the canonical representation of its constant value", it was more like "the canonicalized instruction value could be the canonical representation of its constant value", where canonicalization consists of recursively replacing IDs with their canonical counterparts, but not performing any evaluation.
For example, an
ArrayType
whosebound_id
is the expression1 + 1
does not "represent a constant value" because it's not in the proper form to represent a constant value (it's not a "constant inst" in the terminology you used inconstant.h
), but we useAlways
forArrayType
because it always has a constant value of the same kind (in this case, anArrayType
whosebound_id
is theIntValue
2).
It looks to me like ArrayType
is Conditional
, not Always
, and in fact it's the example we use above to explain Conditional
. From a quick skim, I'm not seeing an obvious way to transpose your example to any of the Always
inst kinds -- they all look like their operands are fully-evaluated by construction. So it seems like the inst kinds that use this enum might be following a definition closer to mine, where Always
means that no further evaluation is needed.
On the other hand, the code that currently reads Always
seems to be following a definition closer to yours: if I understand correctly, TryEvalTypedInst
will recursively evaluate the operands of an Always
inst; it just doesn't then go on to do any kind-specific evaluation logic. However, that does suggest that the original documentation was a little imprecise: it's not just that the inst always has a constant value of the same kind, it's that its constant value can be computed by evaluating and canonicalizing each operand. However, in practice that may be a distinction without a difference.
Side musing: I wonder if it would be cleaner to factor this into two separate enums: one that chooses the TryEvalTypedInst
algorithm, and one that's responsible for specifying whether insts can be/have a constant value. For example:
enum class EvalAlgorithm {
// Always returns NotConstant.
// Corresponds to `None`.
NoEval,
// Always returns the inst_id as a constant.
// Corresponds to `Unique`
Unique,
// Replaces operands with their canonicalized constant values.
// Corresponds to `Always` and `WheneverPossible`
ReplaceOnly,
// Uses PerformDelayedAction.
// Corresponds to `InstAction`
DelayedAction,
// Uses 3-argument EvalConstantInst.
// Corresponds to the subset of `Indirect`, `SymbolicOnly` and `Conditional`
// inst kinds that have constant_needs_inst_id == true.
CustomWithId,
// Uses 2-argument EvalConstantInst.
// Corresponds to the subset of `Indirect`, `SymbolicOnly` and `Conditional`
// inst kinds that have constant_needs_inst_id == false.
CustomWithoutId,
};
enum class InstConstantValueKind {
// Corresponds to `Never`.
Never,
// Corresponds to `Indirect`.
CanHaveConstantValue,
// Corresponds to `SymbolicOnly`, `InstAction`, `Conditional`, and `WheneverPossible`.
CanBeConstantValue,
// Corresponds to `Always`.
MustHaveConstantValue,
// Corresponds to `Unique`.
MustBeConstantValue,
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether particular inst IDs are the canonical instruction that defines a constant value (that is, whether
constant_values().GetConstantInstId(inst_id) == inst_id
) is a fundamental part of what we're trying to document here.How so? None of the places where we use this enum seem to care about that property.
I think the place where this matters the most is for Unique
: the key property that Unique
gives is that constant_values().GetConstantInstId(inst_id) == inst_id
for every instruction of that kind. On reflection, it might not be such a big deal for other instructions, except that it's a part of establishing their canonical form.
For what it's worth, the model I had in mind wasn't "the instruction value could be the canonical representation of its constant value", it was more like "the canonicalized instruction value could be the canonical representation of its constant value", where canonicalization consists of recursively replacing IDs with their canonical counterparts, but not performing any evaluation.
Oh, OK. For me, the new terminology isn't communicating that well -- I find it unclear to describe an instruction as "represent[ing] a constant" if that instruction representation can never be the representation that's used for a constant because it has non-canonical field values -- but maybe a different term would work better with this same style of approach. Maybe a good term (eg, from lambda calculus / term rewrite systems) to use would be to talk about whether an expression is in reduced form or not.
That said, the process by which we evaluate is to first canonicalize the fields and then attempt to reduce the instruction, so talking about instructions that are reduced but not canonical doesn't really match the evaluator's logic. But maybe nonetheless separating canonicalization from reduction here would make the explanation clearer.
It looks to me like
ArrayType
isConditional
, notAlways
, and in fact it's the example we use above to explainConditional
.
Oops, I guess I saw the InstIsType::Always
and thought it was the InstConstantKind
!
From a quick skim, I'm not seeing an obvious way to transpose your example to any of the
Always
inst kinds -- they all look like their operands are fully-evaluated by construction.
I think ClassType
is an example where this happens -- hopefully a correct one this time! If we form a ClassType
with a non-canonical SpecificId
, evaluation will replace that with a corresponding canonical SpecificId
(with the arguments canonicalized) when forming the canonical ClassType
.
Side musing: I wonder if it would be cleaner to factor this into two separate enums: one that chooses the
TryEvalTypedInst
algorithm, and one that's responsible for specifying whether insts can be/have a constant value.
I think probably the best test for that kind of approach would be to see what impact it has on the consumers of this information. It appears to me that the two choices aren't orthogonal, and I'm worried it'd add complexity to write the code in a way that either tries to support all the combinations or, say, static_assert
s on invalid combinations, when many of them never come up or don't make sense.
No description provided.