Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Comb] Crash in AndOp folder #8024

Open
teqdruid opened this issue Jan 2, 2025 · 9 comments
Open

[Comb] Crash in AndOp folder #8024

teqdruid opened this issue Jan 2, 2025 · 9 comments
Assignees
Labels
Comb Involving the `comb` dialect

Comments

@teqdruid
Copy link
Contributor

teqdruid commented Jan 2, 2025

In a237db5:

hw.module @MMIOAxiReadWriteDemux(in %data_valid : i1, out sel_ready : i1) {
  %0 = comb.and bin %2, %data_valid : i1
  %2 = comb.and bin %data_valid, %2 : i1
  hw.output %0 : i1
}
> $ circt-opt crash.mlir --canonicalize
circt-opt: /home/jodemme/circt/llvm/llvm/include/llvm/ADT/STLExtras.h:1289: ReferenceT llvm::detail::indexed_accessor_range_base<mlir::OperandRange, mlir::OpOperand *, mlir::Value, mlir::Value, mlir::Value>::operator[](size_t) const [DerivedT = mlir::OperandRange, BaseT = mlir::OpOperand *, T = mlir::Value, PointerT = mlir::Value, ReferenceT = mlir::Value]: Assertion `Index < size() && "invalid index for value range"' failed.
PLEASE submit a bug report to https://github.com/llvm/circt and include the crash backtrace.
Stack dump:
0.      Program arguments: circt-opt crash.mlir --canonicalize
 #0 0x000055e35902849d llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/jodemme/circt/llvm/llvm/lib/Support/Unix/Signals.inc:723:11
 #1 0x000055e35902898b PrintStackTraceSignalHandler(void*) /home/jodemme/circt/llvm/llvm/lib/Support/Unix/Signals.inc:798:1
 #2 0x000055e3590269f6 llvm::sys::RunSignalHandlers() /home/jodemme/circt/llvm/llvm/lib/Support/Signals.cpp:105:5
 #3 0x000055e3590290c5 SignalHandler(int) /home/jodemme/circt/llvm/llvm/lib/Support/Unix/Signals.inc:413:1
 #4 0x00007f6b3a444420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #5 0x00007f6b39dc800b raise /build/glibc-LcI20x/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
 #6 0x00007f6b39da7859 abort /build/glibc-LcI20x/glibc-2.31/stdlib/abort.c:81:7
 #7 0x00007f6b39da7729 get_sysdep_segment_value /build/glibc-LcI20x/glibc-2.31/intl/loadmsgcat.c:509:8
 #8 0x00007f6b39da7729 _nl_load_domain /build/glibc-LcI20x/glibc-2.31/intl/loadmsgcat.c:970:34
 #9 0x00007f6b39db8fd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)
#10 0x000055e359062d88 llvm::detail::indexed_accessor_range_base<mlir::OperandRange, mlir::OpOperand*, mlir::Value, mlir::Value, mlir::Value>::operator[](unsigned long) const /home/jodemme/circt/llvm/llvm/include/llvm/ADT/STLExtras.h:0:5
#11 0x000055e3594accdf circt::comb::AndOp::fold(circt::comb::AndOpGenericAdaptor<llvm::ArrayRef<mlir::Attribute> >) /home/jodemme/circt/lib/Dialect/Comb/CombFolds.cpp:880:12
#12 0x000055e3595000c0 llvm::LogicalResult mlir::Op<circt::comb::AndOp, mlir::OpTrait::ZeroRegions, mlir::OpTrait::OneResult, mlir::OpTrait::OneTypedResult<circt::hw::TypeVariant<mlir::IntegerType, circt::hw::IntType> >::Impl, mlir::OpTrait::ZeroSuccessors, mlir::OpTrait::VariadicOperands, mlir::OpTrait::OpInvariants, mlir::BytecodeOpInterface::Trait, mlir::OpTrait::IsCommutative, mlir::OpTrait::SameTypeOperands, mlir::OpTrait::SameOperandsAndResultType, mlir::ConditionallySpeculatable::Trait, mlir::OpTrait::AlwaysSpeculatableImplTrait, mlir::MemoryEffectOpInterface::Trait>::foldSingleResultHook<circt::comb::AndOp>(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) /home/jodemme/circt/llvm/llvm/../mlir/include/mlir/IR/OpDefinition.h:1907:38
#13 0x000055e359500011 mlir::Op<circt::comb::AndOp, mlir::OpTrait::ZeroRegions, mlir::OpTrait::OneResult, mlir::OpTrait::OneTypedResult<circt::hw::TypeVariant<mlir::IntegerType, circt::hw::IntType> >::Impl, mlir::OpTrait::ZeroSuccessors, mlir::OpTrait::VariadicOperands, mlir::OpTrait::OpInvariants, mlir::BytecodeOpInterface::Trait, mlir::OpTrait::IsCommutative, mlir::OpTrait::SameTypeOperands, mlir::OpTrait::SameOperandsAndResultType, mlir::ConditionallySpeculatable::Trait, mlir::OpTrait::AlwaysSpeculatableImplTrait, mlir::MemoryEffectOpInterface::Trait>::getFoldHookFn()::'lambda'(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&)::operator()(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const /home/jodemme/circt/llvm/llvm/../mlir/include/mlir/IR/OpDefinition.h:1883:16
#14 0x000055e3594fffbd llvm::LogicalResult llvm::detail::UniqueFunctionBase<llvm::LogicalResult, mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&>::CallImpl<mlir::Op<circt::comb::AndOp, mlir::OpTrait::ZeroRegions, mlir::OpTrait::OneResult, mlir::OpTrait::OneTypedResult<circt::hw::TypeVariant<mlir::IntegerType, circt::hw::IntType> >::Impl, mlir::OpTrait::ZeroSuccessors, mlir::OpTrait::VariadicOperands, mlir::OpTrait::OpInvariants, mlir::BytecodeOpInterface::Trait, mlir::OpTrait::IsCommutative, mlir::OpTrait::SameTypeOperands, mlir::OpTrait::SameOperandsAndResultType, mlir::ConditionallySpeculatable::Trait, mlir::OpTrait::AlwaysSpeculatableImplTrait, mlir::MemoryEffectOpInterface::Trait>::getFoldHookFn()::'lambda'(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const>(void*, mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) /home/jodemme/circt/llvm/llvm/include/llvm/ADT/FunctionExtras.h:222:12
#15 0x000055e3590321ff llvm::unique_function<llvm::LogicalResult (mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const>::operator()(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const /home/jodemme/circt/llvm/llvm/include/llvm/ADT/FunctionExtras.h:413:12
#16 0x000055e3594ff41e mlir::RegisteredOperationName::Model<circt::comb::AndOp>::foldHook(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) /home/jodemme/circt/llvm/llvm/../mlir/include/mlir/IR/OperationSupport.h:538:14
#17 0x000055e35b88470e mlir::OperationName::foldHook(mlir::Operation*, llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) const /home/jodemme/circt/llvm/mlir/include/mlir/IR/OperationSupport.h:265:23
#18 0x000055e35b87c9c8 mlir::Operation::fold(llvm::ArrayRef<mlir::Attribute>, llvm::SmallVectorImpl<mlir::OpFoldResult>&) /home/jodemme/circt/llvm/mlir/lib/IR/Operation.cpp:636:22
#19 0x000055e35b87cdfe mlir::Operation::fold(llvm::SmallVectorImpl<mlir::OpFoldResult>&) /home/jodemme/circt/llvm/mlir/lib/IR/Operation.cpp:666:10
...

Will try to reduce this test case further.

@teqdruid teqdruid added the Comb Involving the `comb` dialect label Jan 2, 2025
@uenoku
Copy link
Member

uenoku commented Jan 2, 2025

Seems like AndOp canoncalizer is creating zero-operand operation.

//===-------------------------------------------===//
Processing operation : 'comb.and'(0x4b3c980) {
  %4 = "comb.and"(%18, %2) <{twoState}> : (i1, i1) -> i1


  * Pattern  : 'comb.and -> ()' {
Trying to match ""
    ** Insert  : 'comb.and'(0x4b528c0)
    ** Replace : 'comb.and'(0x4b3c980)
    ** Modified: 'hw.output'(0x4b22000)
    ** Modified: 'hw.output'(0x4b22000)
    ** Erase   : 'comb.and'(0x4b3c980)
"" result 1
  } -> success : pattern applied successfully
// *** IR Dump After Pattern Application ***
'comb.and' op expected 1 or more operands, but found 0
mlir-asm-printer: 'hw.module' failed to verify and will be printed in generic form
"hw.module"() <{module_type = !hw.modty<input clk : i1, input rst : i1, input sel : !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>, input sel_valid : i1, input data : i64, input data_valid : i1, input read_data_ready : i1, output sel_ready : i1, output data_ready : i1, output read_data : !hw.typealias<@pycde::@AXI_Lite_Read_Resp, !hw.struct<data: i32, resp: i2>>, output read_data_valid : i1>, parameters = [], result_locs = [loc("a.mlir":7:226), loc("a.mlir":7:246), loc("a.mlir":7:267), loc("a.mlir":7:360)], sym_name = "MMIOAxiReadWriteDemux"}> ({
^bb0(%arg0: i1, %arg1: i1, %arg2: !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>, %arg3: i1, %arg4: i64, %arg5: i1, %arg6: i1):
  %0 = "hw.constant"() <{value = true}> : () -> i1
  %1 = "hw.constant"() <{value = 0 : i2}> : () -> i2
  %2 = "comb.and"(%arg5, %arg3) <{twoState}> : (i1, i1) -> i1
  %3 = "hw.struct_create"(%arg4, %arg2) : (i64, !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>) -> !hw.struct<a: i64, b: !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>>
  %4 = "comb.and"() <{twoState}> : () -> i1
  %5 = "hw.struct_extract"(%3) <{fieldIndex = 0 : i32}> : (!hw.struct<a: i64, b: !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>>) -> i64
  %6 = "hw.struct_extract"(%3) <{fieldIndex = 1 : i32}> : (!hw.struct<a: i64, b: !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>>) -> !hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>
  %7 = "hw.struct_extract"(%6) <{fieldIndex = 1 : i32}> : (!hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>) -> i1
  %8 = "comb.extract"(%5) <{lowBit = 0 : i32}> : (i64) -> i32
  %9 = "comb.extract"(%5) <{lowBit = 32 : i32}> : (i64) -> i32
  %10 = "comb.mux"(%7, %9, %8) <{twoState}> {sv.namehint = "mux_None_in0_in1"} : (i1, i32, i32) -> i32
  %11 = "hw.struct_create"(%10, %1) : (i32, i2) -> !hw.typealias<@pycde::@AXI_Lite_Read_Resp, !hw.struct<data: i32, resp: i2>>
  %12 = "hw.struct_extract"(%6) <{fieldIndex = 0 : i32}> : (!hw.typealias<@pycde::@MMIOSel, !hw.struct<write: i1, upper: i1>>) -> i1
  %13 = "comb.xor"(%12, %0) <{twoState}> : (i1, i1) -> i1
  %14 = "comb.and"(%2, %13) <{twoState}> : (i1, i1) -> i1
  %15 = "comb.and"(%14, %18) <{twoState}> : (i1, i1) -> i1
  %16 = "comb.and"(%2, %12) <{twoState}> : (i1, i1) -> i1
  %17 = "comb.and"(%16, %18) <{twoState}> : (i1, i1) -> i1
  %18 = "comb.and"(%arg6, %17) <{twoState}> : (i1, i1) -> i1
  "hw.output"(%4, %4, %11, %15) : (i1, i1, !hw.typealias<@pycde::@AXI_Lite_Read_Resp, !hw.struct<data: i32, resp: i2>>, i1) -> ()
}) {output_file = #hw.output_file<"MMIOAxiReadWriteDemux.sv", includeReplicatedOps>} : () -> ()

@teqdruid
Copy link
Contributor Author

teqdruid commented Jan 2, 2025

I was able to further reduce it. It seems to be a cycle issue.

@teqdruid
Copy link
Contributor Author

teqdruid commented Jan 2, 2025

I think the problem is in canonicalizeIdempotentInputs

@teqdruid
Copy link
Contributor Author

teqdruid commented Jan 2, 2025

I think I understand why canonicalizeIdempotentInputs is producing incorrect output, but I'm not seeing a minimal way to fix it. @dtzSiFive do you have any suggestions?

@dtzSiFive
Copy link
Contributor

dtzSiFive commented Jan 3, 2025

Taking a look, thanks for the ping and reproducer. This looked familiar, I've seen the zero-operands IR from fuzzing.
Digging a bit, FWIW it looks like this bug existed prior to #7514 (first released in firtool 1.82.0), and with 1.80.0 we produced invalid IR but did not crash. In 1.81.0 we crash. I'll go through my fuzzing results later.

Anyway, while poking at this here's a related issue that we process indefinitely and never terminate (same circt-opt -canonicalize invocation):

hw.module @MMIOAxiReadWriteDemux() {
  %0 = comb.and bin %0 : i1
}

EDIT: The above example is in the folder, taking the and(x) -> x path repeatedly.

@dtzSiFive
Copy link
Contributor

dtzSiFive commented Jan 3, 2025

My simple suggestion that I'm still thinking through is to just check if the computed uniqueInputs is empty and do nothing in that case.
I'm not convinced, as I think any cycle in the explored operand graph is grounds to abort any attempt to reason about this (?).

IIRC a number of things in our compiler fall over in the presence of these sorts of cycles, FWIW.

A small variation that canonicalizeIdempotentInputs will drop the comb cycle portion of the operation:

hw.module @MMIOAxiReadWriteDemux(in %data_valid : i1, out sel_ready : i1) {
  %0 = comb.and bin %2, %data_valid : i1
  %2 = comb.and bin %2 : i1
  hw.output %0 : i1
}

-> %0 = comb.and bin %data_valid : i1 -> %data_valid, at least once working around folder infinite loop'ing on the cycle. How do we feel about that? 🙃 . Dropping cyclic portions? 😦 .

(and similarly:

hw.module @MMIOAxiReadWriteDemux(in %x: i1, in %y: i1, out z : i1) {
  %0 = comb.and bin %1, %x: i1
  %1 = comb.and bin %1, %y: i1
  hw.output %0 : i1
}

Here, canonicalizeIdempotentInputs turns %0 into %0 = comb.and %x 🤔).

I'm trying to determine what this IR even means, and whether it might be invalid (and if so what might we most reasonably do about it that isn't crashing/looping!). Thoughts?

@teqdruid
Copy link
Contributor Author

teqdruid commented Jan 3, 2025

My simple suggestion that I'm still thinking through is to just check if the computed uniqueInputs is empty and do nothing in that case.

I'm not convinced that would cover all of the potential bugs. I could see a case where on of multiple inputs is part of a cycle.

I'm trying to determine what this IR even means

This particular case was a bug in user code (PyCDE). I fixed the bug and avoided this particular bug.

and whether it might be invalid (and if so what might we most reasonably do about it that isn't crashing/looping!). Thoughts?

Cycles in combinational graphs should be totally legal IMO. It should be possible to build ring oscillators in CIRCT! Now it's totally acceptable to just give up on any and all optimizations if a cycle is present. Whether or not frontends allow this behavior is another question.

-> %0 = comb.and bin %data_valid : i1 -> %data_valid, at least once working around folder infinite loop'ing on the cycle. How do we feel about that? 🙃 . Dropping cyclic portions? 😦

In addition to my desire to have this be legal, this is totally unintuitive behavior especially in a canonicalizer. Since this was a bug in client code, I would at least like CIRCT to crap out with a decent error message. In nearly all cases, a comb cycle represents a bug, so we shouldn't be silently dropping it.

I was thinking about writing a comb cycle analysis pass (I couldn't find one in the comb dialect) for PyCDE to optionally run to detect these sorts of bugs.

@dtzSiFive
Copy link
Contributor

I'm not convinced that would cover all of the potential bugs. I could see a case where on of multiple inputs is part of a cycle.

Yep, I believe that's the data_valid variation I posted above (the %2 operand is a cycle, other is not).

Looks like our folders, flattening, canonicalizeIdempotentInputs (probably others, I haven't looked) all are various sorts of wrong in the presence of combinatorial cycles for many comb operations -- ranging from dropping cyclic portions (above), crashing/invalid IR (this issue but many related across other comb ops), infinite loops, or dropping gates in the cycle as part of canonicalization which probably isn't desired (?) (see below, not gate cycle snippet).

I think something pervasive and perhaps fundamental in how our code (especially in canonicalization, probably in the export path as well) views the IR is at odds with what's desired when generating IR containing combinatorial cycles intentionally -- each gate is significant, vs the behavior it encodes functionally/"mathematically" (viewed as an expression, a perspective from which cycles are at least weird 😄).

Not gate cycle:

hw.module @foo(out y : i1) {
  %one = hw.constant 1 : i1
  %0 = comb.xor %10, %one : i1
  %1 = comb.xor %0, %one : i1
  %2 = comb.xor %1, %one : i1
  %3 = comb.xor %2, %one : i1
  %4 = comb.xor %3, %one : i1
  %5 = comb.xor %4, %one : i1
  %6 = comb.xor %5, %one : i1
  %7 = comb.xor %6, %one : i1
  %8 = comb.xor %7, %one : i1
  %9 = comb.xor %8, %one : i1
  %10 = comb.xor %9, %one : i1
  hw.output %10 : i1
}

Which CIRCT canonicalizes down to a single not, since that appears "equivalent" but, is it?

(change to even number of gates for canonicalizer infinite loop, FWIW)


Cycles in combinational graphs should be totally legal IMO. It should be possible to build ring oscillators in CIRCT! Now it's totally acceptable to just give up on any and all optimizations if a cycle is present. Whether or not frontends allow this behavior is another question.

Thanks, that makes sense to me! Don't love disallowing things if we can avoid it 👍 .

Disabling optimizations (for lack of a better plan at least in the short term) and carefully emitting IR containing combinatorial cycles seems like the way to go.

Doing this automatically and selectively in our pipeline is an interesting question.

What do various tools/flows do with this sort of thing? Are these put in modules / blocks that are specially annotated with directives, that sort of thing? (How is this modeled / conceptualized / managed elsewhere in other tools, languages, so on?)

In addition to my desire to have this be legal, this is totally unintuitive behavior especially in a canonicalizer. Since this was a bug in client code, I would at least like CIRCT to crap out with a decent error message. In nearly all cases, a comb cycle represents a bug, so we shouldn't be silently dropping it.

I agree. And of course it shouldn't crash either!

I was thinking about writing a comb cycle analysis pass (I couldn't find one in the comb dialect) for PyCDE to optionally run to detect these sorts of bugs.

A comb cycle analysis pass sounds great. I wonder how to craft this to be reasonable in presence of various complicated constructs or interactions with other (unknown?) dialects, precision re:bit-sensitivity, so on? Sounds like a great addition in whatever form!

If our folders and canonicalizers need to know which operations are part of cycles, IMO having a pass detect these (and mark or error?) is a reasonable way to go (vs folders/canonicalizers needing to check this themselves, everywhere/repeatedly).

And just to re-iterate: many folders/canonicalizers break in some form with combinatorial cycles (ignoring how to encode when preserving the gate count/structure is important). Let's get consensus on how we want to handle this and then "someone" can wade through and look for and address these appropriately (or file issues).

WDYT?

@teqdruid
Copy link
Contributor Author

teqdruid commented Jan 6, 2025

I think I agree with all of that.

But I think the root of the problem is that we're misusing canonicalizers to do optimizations. Canonicalizers should be used to clean up IR to enable optimizations to run better. Anything which could break in the presence of cycles should be in an optional pass. This is a long-running issue which needs to be fixed.

A comb cycle analysis pass sounds great. I wonder how to craft this to be reasonable in presence of various complicated constructs or interactions with other (unknown?) dialects, precision re:bit-sensitivity, so on? Sounds like a great addition in whatever form!

I was just gonna do cycles in the comb dialect ops. I think this would cover 95% of cases I care about. But we could introduce a "combinational" trait to enable dialects to mark ops as combinational. In particular, some ops in the sv dialect could be helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Comb Involving the `comb` dialect
Projects
None yet
Development

No branches or pull requests

3 participants