
> None of the Rust complexity leaks over into the LLVM IR it generates

The leakage even triggers performance issues in compilers: https://users.rust-lang.org/t/5-hours-to-compile-macro-what-...



No, that's not "Rust leakage", because that misbehavior has nothing to do with the semantics of Rust as a language. It's the effect of some unoptimized, function-scope pattern-matching substitutions in LLVM being applied to the user's very large function body. Quoting a reply in that same thread:

> LLVM is not good at big functions, I have seen this before as well. One of the functions you are generating is almost 100,000 lines of Rust code including tons of internal control flow.

The same would happen if you wrote (or generated) 100,000 lines of C in one function body, and fed it through Clang with -O2.

If you're wondering, there's nothing in rustc that makes it implicitly turn regular code into huge function bodies, either. It's just someone abusing Rust macros to codegen badly, not taking into account how the granularity of units of compilation interacts with the time-complexity of optimization passes.
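To make the granularity point concrete, here is a toy sketch, not the code from the linked thread. The macro names `one_big_fn` and `many_small_fns` and the arithmetic are made up for illustration. The first macro pastes every step into a single function body, so that body grows with the argument list; the second emits one small function per step. With three steps the difference is invisible (and an inliner may merge the small functions anyway), but it shows which shape the macro author is choosing to hand the optimizer.

```rust
// Hypothetical macro: every listed step is pasted inline into ONE function,
// so the size of the function body grows with the number of arguments.
macro_rules! one_big_fn {
    ($name:ident, $($delta:expr),* $(,)?) => {
        fn $name(mut acc: i64) -> i64 {
            $( acc = acc.wrapping_mul(31).wrapping_add($delta); )*
            acc
        }
    };
}

// Hypothetical macro: each step becomes its own small function, so the
// expansion grows in the number of functions rather than in body size.
macro_rules! many_small_fns {
    ($( $name:ident => $delta:expr ),* $(,)?) => {
        $(
            fn $name(acc: i64) -> i64 {
                acc.wrapping_mul(31).wrapping_add($delta)
            }
        )*
    };
}

one_big_fn!(run_inline, 1, 2, 3);
many_small_fns!(step_a => 1, step_b => 2, step_c => 3);

// Same computation as `run_inline`, composed from the small functions.
fn run_split(acc: i64) -> i64 {
    step_c(step_b(step_a(acc)))
}
```

Both `run_inline(0)` and `run_split(0)` compute the same value; only the shape of the generated code differs.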


> that misbehavior has nothing to do with the semantics of Rust as a language.

Macros are part of the language; they aren’t an external code generation tool.

> The same would happen if you wrote (or generated) 100,000 lines of C in one function body

Absolutely, but neither C nor C++ makes it easy to generate such code. It's technically doable with the C preprocessor, but it’s not easy. C++ templates don’t normally expand into 100k-line functions either; they can easily expand into 100k small functions.


People write code generators that emit C/C++ code all the time, and 10KLOC functions coming from such a generator aren't unheard of. The x86 instruction selector LLVM generates from tablegen is over 240KLOC, although that's emitted effectively as an interpreted table, so the actual code size of the function is much, much smaller.
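As a rough illustration of the "interpreted table" idea (this is not how tablegen's output actually looks; the `Op` enum, `TABLE`, and `interpret` are invented for the sketch): the generator emits data, and a small fixed loop walks it. The table can grow to any length while the function the optimizer has to analyze stays the same size.

```rust
// Hypothetical instruction set for the sketch.
#[derive(Clone, Copy)]
enum Op {
    Add(i64),
    Mul(i64),
    Neg,
}

// The "generated" part: plain data. It grows with the generator's input,
// but it is not control flow that optimization passes must work through.
static TABLE: &[Op] = &[Op::Add(2), Op::Mul(10), Op::Neg, Op::Add(5)];

// The hand-written part: a small interpreter whose code size is fixed,
// no matter how long the table is.
fn interpret(table: &[Op], mut acc: i64) -> i64 {
    for op in table {
        acc = match *op {
            Op::Add(n) => acc.wrapping_add(n),
            Op::Mul(n) => acc.wrapping_mul(n),
            Op::Neg => acc.wrapping_neg(),
        };
    }
    acc
}
```

Calling `interpret(TABLE, 1)` walks the four entries in order. The trade-off is the usual one: the table form compiles quickly but pays for dispatch at run time.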


You’re correct that it’s possible to trigger that performance issue without Rust.

However, it doesn’t happen with idiomatic C or C++ code. C macros are too hard to use for that. C++ templates are designed to expand into many small functions, as opposed to a single huge one.

It was Rust complexity leaking over into the LLVM IR.


> C++ templates are designed to expand into many small functions, as opposed to a single huge one.

...so are Rust macros. The user was using Rust macros very non-idiomatically.

Also, you don't use C macros to write (large amounts of) C. 90% of the world's generated C, by volume, comes from, I would say, one of two programs: autoconf (through m4), or yacc. And both autoconf check stanzas and yacc's inline C support are easy to screw up in ways that result in horrible code that makes compilers barf.

Also, to be clear, you're conflating two things with the phrase "Rust complexity." This thread was originally about the difficulty of implementing a Rust compiler for a toolchain, which therefore makes "Rust's complexity" here refer specifically to the complexity of the static-analysis semantics of Rust that make writing a Rust compiler frontend difficult.

The solution—relying on the existing rustc frontend for the "hard stuff", and just implementing your own compiler backend to target the embedded architecture—is not rebutted by a claim that you can write bad Rust code that makes the optimization passes of certain compilers choke. You wouldn't even be writing optimization passes. You'd be writing a backend. The optimization passes are universal. It's the job of the LLVM developers to ensure that they don't choke on things like this.

And, although 100,000-line functions are dumb, it's the LLVM development team's fault that the optimization passes have superlinear time-complexity that stalls out on a function like that. It's not up to the Rust compiler authors to avoid emitting that type of code, or to somehow prevent you from writing it; it's up to the LLVM devs to make their optimizer work, and work efficiently, for any "valid" code. Honestly, if anything, this code (if you could get it) would be a great integration test to submit to LLVM, for them to work towards passing.


At the scale of macro complexity that occurred in the Rust code, you would be dealing with autogenerated C/C++ code. Autogenerated C/C++ code of that complexity absolutely exists in practice.

The problem here isn't Rust's complexity; it's autogeneration taken to extremes without concern for its impact on compile time. The fact that Rust provides built-in macro syntax to do this autogeneration is of little importance, since it will happen just as frequently with external build tools in the C/C++ world.



