Cross checks design
We need to insert cross-checks on both the C and Rust code, and the cross-checks need to (mostly) match. We assume that we have 2 parallel compilation pipelines, one per language:
- C -> clang -> clang AST -> clang IR emitter -> LLVM IR -> LLVM compilation -> output executable
- C -> clang -> clang AST -> C2Rust transpiler -> Rust -> Rust frontend -> LLVM IR -> LLVM compilation -> output executable
We can insert our cross-checks in several places in both pipelines: directly on C code, on clang AST, or on LLVM IR. We believe the most flexible solution is a hybrid approach.
To reduce the C2Rust user's porting effort, we should automatically insert as many cross-checks as possible. Additionally, most of the cross-checks should not be visible in Rust code, which means that the best place to insert them is the LLVM IR. However, this has a few drawbacks:
- Late in the pipeline, the LLVM IRs might not exactly match between the 2 front-end languages. For example, the C function void foo(const char *arr, size_t len) might have been translated to fn foo(arr: &[u8]). This could be due to either automatic or manual refactoring.
- Some front-end information might not be available in the IR, e.g., type information. For example, C structures do not correspond 1:1 to LLVM IR structures, and we might need the former to implement more advanced checks.
For these reasons, we propose a hybrid approach: the LLVM backend automatically inserts cross-checks, but also provides a cross-check mutator interface that lets the Rust code do the following:
- Insert new cross-checks (in case the refactoring removes or restructures code that is present in the C version)
- Remove implicit cross-checks, e.g., where the Rust code adds additional functions or other code that isn't present in the C version
- Mutate the implicit cross-checks, e.g., see the foo function above
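The three mutator operations listed above can be sketched as a small rule table. This is only an illustration under stated assumptions: the names XCheckMutator, Mutation, add_rule and checks_for are hypothetical and not part of any existing C2Rust interface, and a cross-check is reduced to the function name it reports.

```rust
use std::collections::HashMap;

/// Hypothetical mutations that Rust code could request for one function.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Mutation {
    /// Emit an extra check before the implicit one, e.g. for a C function
    /// that the refactoring merged into this Rust function.
    InsertBefore(String),
    /// Drop the implicit check, e.g. for a Rust-only helper.
    Remove,
    /// Replace the implicit check, e.g. report the original C name.
    Replace(String),
}

/// Hypothetical rule table consulted when implicit checks are emitted.
#[derive(Debug, Default)]
pub struct XCheckMutator {
    rules: HashMap<String, Mutation>,
}

impl XCheckMutator {
    /// Register a mutation for the named Rust function.
    pub fn add_rule(&mut self, function: &str, mutation: Mutation) {
        self.rules.insert(function.to_string(), mutation);
    }

    /// Names of the checks to emit when `function` is entered.
    /// Without a rule, the implicit check is kept unchanged.
    pub fn checks_for(&self, function: &str) -> Vec<String> {
        match self.rules.get(function) {
            None => vec![function.to_string()],
            Some(Mutation::InsertBefore(extra)) => {
                vec![extra.clone(), function.to_string()]
            }
            Some(Mutation::Remove) => Vec::new(),
            Some(Mutation::Replace(name)) => vec![name.clone()],
        }
    }
}
```

In this sketch, a function without a rule keeps its implicit check, so the common case needs no annotation in the Rust code.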
For mutators, we may want to support certain implicit mutations. For example, we could assume that a (const char*, size_t) pair of C values always corresponds to a str string in Rust.
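One way to picture such an implicit mutation is to reduce both argument forms to the same byte sequence before hashing, so that the two sides produce the same cross-check value. The sketch below assumes an FNV-1a hash and the hypothetical helpers xcheck_ptr_len and xcheck_str; the design does not prescribe either.

```rust
use std::os::raw::c_char;
use std::slice;

/// 64-bit FNV-1a, chosen here only because it is easy to reproduce in C.
pub fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Cross-check value for the C-style (pointer, length) pair.
///
/// # Safety
/// `ptr` must point to at least `len` readable bytes.
pub unsafe fn xcheck_ptr_len(ptr: *const c_char, len: usize) -> u64 {
    // View the C buffer as a byte slice without copying it.
    let bytes = unsafe { slice::from_raw_parts(ptr.cast::<u8>(), len) };
    fnv1a(bytes)
}

/// Cross-check value for the refactored Rust argument.
pub fn xcheck_str(s: &str) -> u64 {
    fnv1a(s.as_bytes())
}
```

Both helpers hash the same bytes, so a refactoring from the pair to a string slice would leave the cross-check value unchanged in this sketch.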
Initially, we want to add a cross-check on each function entry and exit. As a starting point, we will only check which functions get called, and later add checks for the values of the arguments. The latter could prove very tricky, as pointer, array and structure arguments could pose problems.
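As a rough sketch of this starting point, each side could record one event per function entry and exit, and the two event sequences could then be compared. Everything below is an assumption made for illustration: the XCheck and XCheckKind types, the record and first_divergence helpers, and the choice of FNV-1a over the function name.

```rust
/// Kind of event recorded at a cross-check point.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum XCheckKind {
    FunctionEntry,
    FunctionExit,
}

/// One recorded cross-check: the event kind plus a hash of the function name.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct XCheck {
    pub kind: XCheckKind,
    pub value: u64,
}

/// 64-bit FNV-1a; any hash works as long as both sides use the same one.
pub fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Append an entry or exit event for the named function to a log.
pub fn record(log: &mut Vec<XCheck>, kind: XCheckKind, function: &str) {
    log.push(XCheck { kind, value: fnv1a(function.as_bytes()) });
}

/// Index of the first event where the two logs disagree, if any.
/// A log that ends early counts as diverging at its end.
pub fn first_divergence(a: &[XCheck], b: &[XCheck]) -> Option<usize> {
    let common = a.len().min(b.len());
    for i in 0..common {
        if a[i] != b[i] {
            return Some(i);
        }
    }
    if a.len() != b.len() { Some(common) } else { None }
}
```

Only the function name is hashed here, which matches the initial goal of checking which functions get called; argument values would need additional events.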