Gathering function information

Given this line:

```
#pragma rsfn VecInt_with_capacity = VecInt::with_capacity
```

The tool needs to generate this Rust code:

```rust
#[no_mangle]
pub extern "C" fn VecInt_with_capacity(param_0_c: usize) -> VecInt {
    let param_0_rs: usize = unsafe { mem::transmute(param_0_c) };
    let result_rs: alloc::vec::Vec::<u64> = alloc::vec::Vec::<u64>::with_capacity(param_0_rs);
    let result_c: VecInt = unsafe { mem::transmute(result_rs) };
    return result_c;
}
```
To do that, it needs to know:

- The real name: alloc::vec::Vec::<u64>::with_capacity
- The return type: alloc::vec::Vec::<u64>
- The parameter types: usize
These are actually pretty tricky to determine.
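To see why all three pieces matter, here's a rough sketch of a generator that stamps out the wrapper once they're known. The `FnInfo` struct and `generate_wrapper` function are names I made up for illustration, not the tool's real internals, and the sketch assumes each parameter's C-side type is spelled the same as its Rust type (true for `usize`, not in general).

```rust
/// Hypothetical bundle of everything we need to know about one function.
struct FnInfo<'a> {
    c_name: &'a str,           // name exported to C
    real_name: &'a str,        // fully resolved Rust path
    return_type_c: &'a str,    // type as seen from C
    return_type_rs: &'a str,   // real Rust return type
    param_types: &'a [&'a str], // real Rust parameter types
}

/// Builds the text of an extern "C" wrapper from the gathered information.
fn generate_wrapper(info: &FnInfo<'_>) -> String {
    // Assumption: the C-side parameter type is spelled like the Rust one.
    let params: Vec<String> = info
        .param_types
        .iter()
        .enumerate()
        .map(|(i, ty)| format!("param_{i}_c: {ty}"))
        .collect();
    let args: Vec<String> = (0..info.param_types.len())
        .map(|i| format!("param_{i}_rs"))
        .collect();

    let mut out = String::from("#[no_mangle]\n");
    out.push_str(&format!(
        "pub extern \"C\" fn {}({}) -> {} {{\n",
        info.c_name,
        params.join(", "),
        info.return_type_c
    ));
    for (i, ty) in info.param_types.iter().enumerate() {
        out.push_str(&format!(
            "    let param_{i}_rs: {ty} = unsafe {{ mem::transmute(param_{i}_c) }};\n"
        ));
    }
    out.push_str(&format!(
        "    let result_rs: {} = {}({});\n",
        info.return_type_rs,
        info.real_name,
        args.join(", ")
    ));
    out.push_str(&format!(
        "    let result_c: {} = unsafe {{ mem::transmute(result_rs) }};\n",
        info.return_type_c
    ));
    out.push_str("    return result_c;\n}\n");
    out
}

/// The example from this post, filled in by hand.
fn example() -> String {
    generate_wrapper(&FnInfo {
        c_name: "VecInt_with_capacity",
        real_name: "alloc::vec::Vec::<u64>::with_capacity",
        return_type_c: "VecInt",
        return_type_rs: "alloc::vec::Vec::<u64>",
        param_types: &["usize"],
    })
}

fn main() {
    print!("{}", example());
}
```

The generating part is the easy half. Everything hard is in filling out that struct.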
I was hoping that I could just use std::any::type_name (playground link) or say this:

```rust
println!("fn {}", std::signature_of::<Vec<int>::with_capacity>());
```

But there is no signature_of, I just made that up.
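For the curious, this is roughly what the `std::any::type_name` attempt looks like. The `name_of` helper is my own, and the standard library documents the returned string as best-effort, so the exact text can vary between compiler versions. What you get back is a name for the function item, with no parameter types or return type in it.

```rust
use std::any::type_name;

/// Small helper (hypothetical name): asks for the type name of whatever
/// value we were handed, which lets us name a function item's type.
fn name_of<T>(_: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    // Each function has its own unique, unnameable "function item" type.
    let name = name_of(&Vec::<u64>::with_capacity);
    // Prints a path-like name for the function, not its signature.
    println!("{name}");
}
```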
This is the part I left out of the last post because the path from here becomes treacherous... and the only way forward is too arcane, too terrible to consider.
A horrifying idea

(Or, skip to the final approach!)
I considered a few other options, such as using macros to search through functions, or using the syn crate... but neither of those approaches has access to the information we need.
If we could just see all the functions for a given type, then perhaps we could search them for the correct overload, and print out their parameters.
So I thought, let's read rustc's MIR!
MIR is rustc's "Mid-level Intermediate Representation". rustc turns Rust source code into HIR, then MIR, then eventually, LLVM IR, which later gets turned into assembly code.
If we could search through the Rust libraries' MIR, maybe we could find the right function and print its parameters out!
An even better horrifying idea

But first, I asked on the Rust discord server if that really was the best way to get all the structs and functions for a target crate.
Luckily, Helix/noop_noob and anden3 arrived and told me a much better solution: get rustdoc's JSON output and read it with rustdoc_types! That's probably way, way easier than reading MIR. To this day, I don't know if using MIR would have worked well. Bless these two heroes, and may their names ever live on in glory.
Of course, I was then met with horrified reactions when I explained what I'd use it for. I hope they never learn of the dark sorcery they helped me unleash on this world.
Let's be clear: rustdoc was not made for this. It was made to generate lovely HTML pages like this one. And even though it has JSON output, it's not even stabilized, and organizes its information for, you know, making documentation. What could go wrong?
Anyway, I made the tool invoke rustdoc [10] and load the resulting JSON in with rustdoc_types and serde.
Then, the tool reads every single fn, struct, impl, trait, type, and use in every crate, and collects their relationships into a bunch of hash maps. [11] [12]
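Here's a minimal sketch of what "collecting relationships into hash maps" can look like. To keep it self-contained, `Item`, `Kind`, and `Index` are simplified stand-ins I invented for this example; the real rustdoc_types structures are richer and shaped differently.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical item kinds (not the real rustdoc_types enum).
enum Kind {
    Struct,
    Function,
    /// An impl block: which type it is for, and which functions it holds.
    Impl { for_type: u32, methods: Vec<u32> },
}

struct Item {
    id: u32,
    name: String,
    kind: Kind,
}

/// The lookup tables built in one pass over every item.
struct Index {
    by_id: HashMap<u32, Item>,
    /// type id -> ids of every impl block for that type
    impls_for: HashMap<u32, Vec<u32>>,
}

fn build_index(items: Vec<Item>) -> Index {
    let mut by_id = HashMap::new();
    let mut impls_for: HashMap<u32, Vec<u32>> = HashMap::new();
    for item in items {
        // Record the impl -> type relationship before storing the item.
        if let Kind::Impl { for_type, .. } = &item.kind {
            impls_for.entry(*for_type).or_default().push(item.id);
        }
        by_id.insert(item.id, item);
    }
    Index { by_id, impls_for }
}

/// Made-up sample data: one struct with two impl blocks.
fn sample_items() -> Vec<Item> {
    vec![
        Item { id: 1, name: "Vec".into(), kind: Kind::Struct },
        Item { id: 2, name: String::new(), kind: Kind::Impl { for_type: 1, methods: vec![4] } },
        Item { id: 3, name: String::new(), kind: Kind::Impl { for_type: 1, methods: vec![5] } },
        Item { id: 4, name: "with_capacity".into(), kind: Kind::Function },
        Item { id: 5, name: "from".into(), kind: Kind::Function },
    ]
}

fn main() {
    let index = build_index(sample_items());
    println!("impls for Vec: {:?}", index.impls_for[&1]);
    println!("item 4 is named {}", index.by_id[&4].name);
}
```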
The quest begins!
At first, things were pretty easy. rustdoc_types::Function has pretty straightforward information:

```rust
pub struct Function {
    pub decl: FnDecl,       // Parameters, return types
    pub generics: Generics, // <T>, where, etc.
    pub header: Header,     // const, unsafe, async, etc.
    pub has_body: bool,
}
```
It wasn't terribly hard to loop through all the functions and compare their types to the ones supplied by the user.
Though, there were some tricky parts:

- To find a struct's method, we need to find all impls for that struct, and look through all of them.
- Sometimes, a struct's method is defined in a different crate entirely. [13]
- Types often didn't know that they were using the default drop method.
- _Unwind_Reason_Code and some things in std::detect are referenced but somehow don't exist.
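The first two tricky parts can be sketched like this: gather every impl block for a type, no matter which crate it came from, and scan all of them for the method name. `ImplBlock`, `find_methods`, and the sample data are hypothetical and much simpler than what the rustdoc JSON actually contains.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for one impl block found in the JSON.
struct ImplBlock {
    krate: &'static str,         // crate where this impl block lives
    methods: Vec<&'static str>,  // names of the functions it defines
}

/// Returns (crate, method) for every impl of `type_name` that defines `method`.
fn find_methods<'a>(
    impls_by_type: &'a HashMap<&'static str, Vec<ImplBlock>>,
    type_name: &str,
    method: &str,
) -> Vec<(&'a str, &'a str)> {
    let mut found = Vec::new();
    if let Some(impls) = impls_by_type.get(type_name) {
        // A method can live in any impl block, so check every one of them.
        for block in impls {
            for name in &block.methods {
                if *name == method {
                    found.push((block.krate, *name));
                }
            }
        }
    }
    found
}

/// Made-up sample data: impl blocks spread over two crates.
fn sample() -> HashMap<&'static str, Vec<ImplBlock>> {
    let mut map = HashMap::new();
    map.insert(
        "OsString",
        vec![
            ImplBlock { krate: "std", methods: vec!["new", "push"] },
            ImplBlock { krate: "std", methods: vec!["from"] },
            ImplBlock { krate: "core", methods: vec!["from"] },
        ],
    );
    map
}

fn main() {
    let map = sample();
    for (krate, name) in find_methods(&map, "OsString", "from") {
        println!("found {name} in an impl from crate {krate}");
    }
}
```

Note that the lookup returns a list. A name alone isn't guaranteed to identify exactly one function.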
But it all worked! [14]
At least, until I tried this line:

```
#pragma rsfn OsString_from_str = std::ffi::OsString::from
```

...which made the program burst into flames.
Why's that? Because there are multiple overloads for the OsString::from function! The tool couldn't figure out which to use.
Rust Function Overloads

Some already know about function overloading in Rust, but for those unfamiliar with the term, Wikipedia says:

> Function overloading is the ability to create multiple functions of the same name with different implementations. Calls to an overloaded function will run a specific implementation of that function appropriate to the context of the call, allowing one function call to perform different tasks depending on context.
And unfortunately for us, that's exactly what's happening here with the from method.
You see, we actually want to call this from method at os_str.rs:1595:

```rust
impl<T: ?Sized + AsRef<OsStr>> From<&T> for OsString {
    fn from(s: &T) -> OsString {
        s.as_ref().to_os_string()
    }
}
```

...because our &str is a kind of AsRef<OsStr>, as specified by os_str.rs:574:

```rust
impl AsRef<OsStr> for str { ... }
```
However, we don't want to call this from function at os_str.rs:563, which makes an OsString from a String:

```rust
impl From<String> for OsString {
    fn from(s: String) -> OsString {
        OsString { inner: Buf::from_string(s) }
    }
}
```
And we also don't want to call this from function at convert/mod.rs:765, which can turn any T into itself:

```rust
impl<T> From<T> for T {
    fn from(t: T) -> T {
        t
    }
}
```

For clarity, here's that last one again, but replacing T with OsString:

```rust
impl From<OsString> for OsString {
    fn from(t: OsString) -> OsString {
        t
    }
}
```
In other words, there are three overloads:

- OsString::from(t: OsString)
- OsString::from(s: String)
- OsString::from(s: &T) where T: ?Sized + AsRef<OsStr>
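In ordinary Rust code, the compiler picks among these by looking at the argument type. You can also name each one explicitly with fully qualified syntax, which is a handy way to see that they really are three separate functions. The wrapper function names below are just for this demonstration.

```rust
use std::ffi::OsString;

/// Uses `impl<T: ?Sized + AsRef<OsStr>> From<&T> for OsString`, with T = str.
fn from_str_ref(s: &str) -> OsString {
    <OsString as From<&str>>::from(s)
}

/// Uses `impl From<String> for OsString`.
fn from_string(s: String) -> OsString {
    <OsString as From<String>>::from(s)
}

/// Uses the blanket `impl<T> From<T> for T`, with T = OsString.
fn from_os_string(s: OsString) -> OsString {
    <OsString as From<OsString>>::from(s)
}

fn main() {
    let a = from_str_ref("hello");
    let b = from_string(String::from("hello"));
    let c = from_os_string(OsString::from("hello"));
    // Three different implementations, one resulting value.
    println!("{:?} {:?} {:?}", a, b, c);
}
```

A bare `OsString::from` with no argument in sight, like in the pragma line, gives the tool nothing to choose with.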