groos 3 days ago
The two main parts of a typical C++ compiler are the front-end, which handles language syntax and semantic analysis, and the back-end, which handles code generation.

C++ makes it difficult to implement the front-end as a multithreaded program because it has context‑sensitive syntax (as does C). The meaning of a construct can change depending on whether a name encountered during parsing refers to an existing declaration or not. As a result, parsing and semantic analysis cannot easily be divided into independent parts that run in parallel, so in practice they are performed serially. A modern implementation will typically carry out semantic analysis in phases, for example binding names first, then analyzing types, and so on, before lowering the resulting representation to a form suitable for code generation. Generally speaking, declarations that introduce names into non‑local scopes must be compiled serially. This also makes the symbol table a limiting factor for parallelism, since access to it must be mutually exclusive. _Some_ constructs can be compiled in parallel, such as function bodies and function template instantiations, but given that build systems already implement per‑translation‑unit parallelism, the additional effort is often not worthwhile.

In contrast, a language like C# is designed with context‑free syntax. This allows a fast top‑level parse to break up the source file (C# has no #include) into declarations that can, in principle, be processed in parallel. There will still be dependencies between declarations, and these will limit parallelism. But given that C# source files are a tiny fraction of the size of a typical C++ translation unit, even here parallel compilation is probably not a big win.

The C++ back-end can take advantage of multithreading far more than the front-end. Once global optimizations are complete, the remaining work can be queued in parallel for code generation.
MSVC works in exactly this way and provides options to control this parallelism. However, the achievable speedup is bounded by Amdahl’s Law: reading in the IR generated by the front-end and performing global optimizations remain serial steps.
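
To make the context-sensitivity point concrete, here is a minimal sketch (the function names are made up for illustration). The same three tokens, `a * b;`, are an expression in one function and a declaration in the other, and the parser can only tell which by looking up what `a` currently names:

```cpp
#include <type_traits>

// Here `a` names a variable, so `a * b;` is an expression statement:
// a multiplication whose result is discarded (compilers may warn about it).
int parsed_as_expression() {
    int a = 6, b = 7;
    a * b;
    return a * b;
}

// Here `a` names a type, so the very same tokens `a * b;` are a declaration:
// `b` is declared as a pointer to int.
bool parsed_as_declaration() {
    using a = int;
    a * b;
    static_assert(std::is_same<decltype(b), int*>::value,
                  "b was declared as int*, not multiplied");
    b = nullptr;          // give the pointer a value before reading it
    return b == nullptr;
}
```

Neither statement can be classified until the declaration of `a` has been processed, which is why the parser needs an up-to-date symbol table at every step.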