Variable Tracking at Assignments (VTA) is a new infrastructure included in GCC used to improve variable tracking during optimizations. This allows GCC to produce more precise, meaningful, and useful debugging information for GDB, SystemTap, and other debugging tools.
When GCC compiles code with optimizations enabled, variables are renamed, moved around, or even removed altogether. As a result, a debugger can report that some variables in optimized code have been <optimized out>. With VTA enabled, optimized code is internally annotated so that optimization passes transparently keep track of each variable's value, regardless of whether the variable is moved or removed. The effect is that more parameter and variable values are available, even for the optimized ( built) code, and the debugger displays the <optimized out> message less often.
VTA's benefits are more pronounced when debugging applications with inlined functions. Without VTA, optimization could completely remove some arguments of an inlined function, preventing the debugger from inspecting their values. With VTA, optimization still happens, and appropriate debugging information is generated for any missing arguments.
VTA is enabled by default when compiling code with optimizations and debugging information enabled (that is, or, more commonly, ). To disable VTA during such builds, add the . In addition, the VTA infrastructure includes the new option . This option tests code compiled by GCC with debug information and without debug information: the test passes if the two binaries are identical. This test ensures that executable code is not affected by any debugging options, which further ensures that there are no hidden bugs in the debug code. Note that adds significant cost in compilation time. See for details about this option.
For more information about the infrastructure and development of VTA, see A Plan to Fix Local Variable Debug Information in GCC, available at the following link:
A slide deck version of this whitepaper is also available at http://people.redhat.com/aoliva/papers/vta/slides.pdf.
This section describes command-line options that are primarily of interest to GCC developers, including options to support compiler testing and investigation of compiler bugs and compile-time performance problems. This includes options that produce debug dumps at various points in the compilation; that print statistics such as memory use and execution time; and that print information about GCC’s configuration, such as where it searches for libraries. You should rarely need to use any of these options for ordinary compilation and linking tasks.
Says to make debugging dumps during compilation at times specified by . This is used for debugging the RTL-based passes of the compiler. The file names for most of the dumps are made by appending a pass number and a word to the , and the files are created in the directory of the output file. In case of option, the dump is output on the given file instead of the pass numbered dump files. Note that the pass number is assigned as passes are registered into the pass manager. Most passes are registered in the order that they will execute and for these passes the number corresponds to the pass execution order. However, passes registered by plugins, passes specific to compilation targets, or passes that are otherwise registered after all the other passes are numbered higher than a pass named "final", even if they are executed earlier. is generated from the name of the output file if explicitly specified and not an executable, otherwise it is the basename of the source file.
Some switches have different meaning when is used for preprocessing. See Preprocessor Options, for information about preprocessor-specific dump options.
Debug dumps can be enabled with a switch or some option . Here are the possible letters for use in and , and their meanings:
Dump after branch alignments have been computed.
Dump after fixing RTL statements that have unsatisfied in/out constraints.
Dump after auto-inc-dec discovery. This pass is only run on architectures that have auto inc or auto dec instructions.
Dump after cleaning up the barrier instructions.
Dump after partitioning hot and cold basic blocks.
Dump after block reordering.
and enable dumping after the two branch target load optimization passes.
Dump after jump bypassing and control flow optimizations.
Dump after the RTL instruction combination pass.
Dump after duplicating the computed gotos.
, , and enable dumping after the three if conversion passes.
Dump after hard register copy propagation.
Dump after combining stack adjustments.
and enable dumping after the two common subexpression elimination passes.
Dump after the standalone dead code elimination passes.
Dump after delayed branch scheduling.
and enable dumping after the two dead store elimination passes.
Dump after finalization of EH handling code.
Dump after conversion of EH handling range regions.
Dump after RTL generation.
and enable dumping after the two forward propagation passes.
and enable dumping after global common subexpression elimination.
Dump after the initialization of the registers.
Dump after the computation of the initial value sets.
Dump after converting to cfglayout mode.
Dump after iterated register allocation.
Dump after the second jump optimization.
enables dumping after the RTL loop optimization passes.
Dump after performing the machine dependent reorganization pass, if that pass exists.
Dump after removing redundant mode switches.
Dump after register renumbering.
Dump after converting from cfglayout mode.
Dump after the peephole pass.
Dump after post-reload optimizations.
Dump after generating the function prologues and epilogues.
and enable dumping after the basic block scheduling passes.
Dump after sign/zero extension elimination.
Dump after common sequence discovery.
Dump after shortening branches.
Dump after sibling call optimizations.
These options enable dumping after five rounds of instruction splitting.
Dump after modulo scheduling. This pass is only run on some architectures.
Dump after conversion from GCC’s “flat register file” registers to the x87’s stack-like registers. This pass is only run on x86 variants.
and enable dumping after the two subreg expansion passes.
Dump after all RTL has been unshared.
Dump after variable tracking.
Dump after converting virtual registers to hard registers.
Dump after live range splitting.
These dumps are defined but always produce empty files.
Produce all the dumps listed above.
Annotate the assembler output with miscellaneous debugging information.
Dump all macro definitions, at the end of preprocessing, in addition to normal output.
Produce a core dump whenever an error occurs.
Annotate the assembler output with a comment indicating which pattern and alternative is used. The length and cost of each instruction are also printed.
Dump the RTL in the assembler output as a comment before each instruction. Also turns on annotation.
Just generate RTL for a function instead of compiling it. Usually used with .
When doing debugging dumps, suppress address output. This makes it more feasible to use diff on debugging dumps for compiler invocations with different compiler binaries and/or different text / bss / data / heap / stack / dso start locations.
Collect and dump debug information into a temporary file if an internal compiler error (ICE) occurs.
When doing debugging dumps, suppress instruction numbers and address output. This makes it more feasible to use diff on debugging dumps for compiler invocations with different options, in particular with and without .
When doing debugging dumps (see option above), suppress instruction numbers for the links to the previous and next instructions in a sequence.
Control the dumping at various stages of inter-procedural analysis language tree to a file. The file name is generated by appending a switch specific suffix to the source file name, and the file is created in the same directory as the output file. The following dumps are possible:
Enables all inter-procedural analysis dumps.
Dumps information about call-graph optimization, unused function removal, and inlining decisions.
Dump after function inlining.
Control the dumping of language-specific information. The and portions behave as described in the option. The following values are accepted:
Enable all language-specific dumps.
Dump class hierarchy information. Virtual table information is emitted unless ’’ is specified. This option is applicable to C++ only.
Dump the raw internal tree data. This option is applicable to C++ only.
Print on the list of optimization passes that are turned on and off by the current command-line options.
Enable and control dumping of pass statistics in a separate file. The file name is generated by appending a suffix ending in ‘’ to the source file name, and the file is created in the same directory as the output file. If the ‘’ form is used, ‘’ causes counters to be summed over the whole compilation unit while ‘’ dumps every event as the passes generate them. The default with no option is to sum counters for each function compiled.
Control the dumping at various stages of processing the intermediate language tree to a file. The file name is generated by appending a switch-specific suffix to the source file name, and the file is created in the same directory as the output file. In case of option, the dump is output on the given file instead of the auto named dump files. If the ‘’ form is used, is a list of ‘’ separated options which control the details of the dump. Not all options are applicable to all dumps; those that are not meaningful are ignored. The following options are available:
Print the address of each node. Usually this is not meaningful as it changes according to the environment and source file. Its primary use is for tying up a dump file with a debug environment.
If has been set for a given decl, use that in the dump instead of . Its primary use is ease of use working backward from mangled names in the assembly file.
When dumping front-end intermediate representations, inhibit dumping of members of a scope or body of a function merely because that scope has been reached. Only dump such items when they are directly reachable by some other path.
When dumping pretty-printed trees, this option inhibits dumping the bodies of control structures.
When dumping RTL, print the RTL in slim (condensed) form instead of the default LISP-like representation.
Print a raw representation of the tree. By default, trees are pretty-printed into a C-like representation.
Enable more detailed dumps (not honored by every dump option). Also include information from the optimization passes.
Enable dumping various statistics about the pass (not honored by every dump option).
Enable showing basic block boundaries (disabled in raw dumps).
For each of the other indicated dump files (), dump a representation of the control flow graph suitable for viewing with GraphViz to . Each function in the file is pretty-printed as a subgraph, so that GraphViz can render them all in a single plot.
This option currently only works for RTL dumps, and the RTL is always dumped in slim form.
Enable showing virtual operands for every statement.
Enable showing line numbers for statements.
Enable showing the unique ID () for each variable.
Enable showing the tree dump for each statement.
Enable showing the EH region number holding each statement.
Enable showing scalar evolution analysis details.
Enable showing optimization information (only available in certain passes).
Enable showing missed optimization information (only available in certain passes).
Enable other detailed optimization information (only available in certain passes).
Instead of an auto named dump file, output into the given file name. The file names and are treated specially and are considered already open standard streams. For example,
    gcc -O2 -ftree-vectorize -fdump-tree-vect-blocks=foo.dump -fdump-tree-pre=/dev/stderr file.c
outputs the vectorizer dump into , while the PRE dump is output to . If two conflicting dump filenames are given for the same pass, then the latter option overrides the earlier one.
Turn on all options, except , , and .
Turn on all optimization options, i.e., , , and .
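The structure of a tree dump switch, as used in the example command above, can be sketched in Python. This is only an illustration: the helper name parse_dump_tree_option is hypothetical, and the sketch assumes that the pass code is the first dash-separated word after the prefix and that everything after the first equal sign is the file name.

```python
def parse_dump_tree_option(arg):
    """Split a tree dump switch into (pass_code, options, filename).

    Assumption: the switch has the shape PREFIX + pass[-option...][=filename],
    as in the example command shown in the text.
    """
    prefix = "-fdump-tree-"
    if not arg.startswith(prefix):
        raise ValueError(f"not a tree dump option: {arg!r}")
    # Everything after the first '=' is the explicit dump file, if any.
    body, _, filename = arg[len(prefix):].partition("=")
    # The first word is the pass code; the rest are dump detail options.
    pass_code, *options = body.split("-")
    # None means that an automatically named dump file is used.
    return pass_code, options, filename or None
```

Applied to the two switches in the example command, the sketch yields ("vect", ["blocks"], "foo.dump") and ("pre", [], "/dev/stderr").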
To determine what tree dumps are available or find the dump for a pass of interest follow the steps below.
- Invoke GCC with and in the output look for a code that corresponds to the pass you are interested in. For example, the codes , , and correspond to the three Value Range Propagation passes. The number at the end distinguishes distinct invocations of the same pass.
- To enable the creation of the dump file, append the pass code to the option prefix and invoke GCC with it. For example, to enable the dump from the Early Value Range Propagation pass, invoke GCC with the option. Optionally, you may specify the name of the dump file. If you don’t specify one, GCC creates as described below.
- Find the pass dump in a file whose name is composed of three components separated by a period: the name of the source file GCC was invoked to compile, a numeric suffix indicating the pass number followed by the letter ‘’ for tree passes (and the letter ‘’ for RTL passes), and finally the pass code. For example, the Early VRP pass dump might be in a file named in the current working directory. Note that the numeric codes are not stable and may change from one version of GCC to another.
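The three-component file name described in the last step can be taken apart with a short Python sketch. The kind letters "t" (tree) and "r" (RTL), the three-digit padding, and the sample name file.c.038t.evrp are assumptions made for illustration; as noted above, the numeric part is not stable between GCC versions.

```python
import re

# Assumed layout: <source file>.<pass number><kind letter>.<pass code>
# Assumed kind letters: "t" for tree passes, "r" for RTL passes.
_DUMP_NAME = re.compile(
    r"^(?P<source>.+)\.(?P<number>\d+)(?P<kind>[tr])\.(?P<code>[^.]+)$"
)


def parse_dump_name(name):
    """Return (source, pass_number, kind, pass_code), or None if no match."""
    match = _DUMP_NAME.match(name)
    if match is None:
        return None
    return match["source"], int(match["number"]), match["kind"], match["code"]


def format_dump_name(source, number, kind, code):
    """Compose a dump file name; the zero padding width is an assumption."""
    return f"{source}.{number:03d}{kind}.{code}"
```

Because the pass number can change between compiler versions, a script is better off matching on the pass code than on the full file name.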
Controls optimization dumps from various optimization passes. If the ‘’ form is used, is a list of ‘’ separated option keywords to select the dump details and optimizations.
The can be divided into two groups: options describing the verbosity of the dump, and options describing which optimizations should be included. The options from both the groups can be freely mixed as they are non-overlapping. However, in case of any conflicts, the later options override the earlier options on the command line.
The following options control the dump verbosity:
Print information when an optimization is successfully applied. It is up to a pass to decide which information is relevant. For example, the vectorizer passes print the source location of loops which are successfully vectorized.
Print information about missed optimizations. Individual passes control which information to include in the output.
Print verbose information about optimizations, such as certain transformations, more detailed messages about decisions etc.
Print detailed optimization information. This includes ‘’, ‘’, and ‘’.
One or more of the following option keywords can be used to describe a group of optimizations:
Enable dumps from all interprocedural optimizations.
Enable dumps from all loop optimizations.
Enable dumps from all inlining optimizations.
Enable dumps from all OMP (Offloading and Multi Processing) optimizations.
Enable dumps from all vectorization optimizations.
Enable dumps from all optimizations. This is a superset of the optimization groups listed above.
If is omitted, it defaults to ‘’, which means to dump all info about successful optimizations from all the passes.
If the is provided, then the dumps from all the applicable optimizations are concatenated into the . Otherwise the dump is output onto . Though multiple options are accepted, only one of them can include a . If other filenames are provided then all but the first such option are ignored.
Note that the output is overwritten in case of multiple translation units. If a combined output from multiple translation units is desired, should be used instead.
In the following example, the optimization info is output to :
outputs missed optimization report from all the passes into , and this one:
prints information about missed optimization opportunities from vectorization passes on . Note that is equivalent to . The order of the optimization group names and message types listed after does not matter.
As another example,
outputs information about missed optimizations as well as optimized locations from all the inlining passes into .
Here the two output filenames and are in conflict since only one output file is allowed. In this case, only the first option takes effect and the subsequent options are ignored. Thus only is produced which contains dumps from the vectorizer about missed opportunities.
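The rules above (keyword order does not matter, missing keywords fall back to defaults, and only the first option that names a file is honored) can be modeled in a few lines of Python. This is a simplified sketch, not GCC's implementation: the option spelling -fopt-info, the keyword names, the defaults "optimized" and "optall", and the file names vec.miss and loop.opt are assumptions chosen for illustration.

```python
VERBOSITY = {"optimized", "missed", "note", "all"}        # assumed keywords
GROUPS = {"ipa", "loop", "inline", "omp", "vec", "optall"}  # assumed keywords


def parse_opt_info(arg):
    """Return (verbosity_set, group_set, filename) for one option."""
    prefix = "-fopt-info"
    if not arg.startswith(prefix):
        raise ValueError(f"not an optimization-info option: {arg!r}")
    body, _, filename = arg[len(prefix):].partition("=")
    keywords = [k for k in body.split("-") if k]
    unknown = [k for k in keywords if k not in VERBOSITY | GROUPS]
    if unknown:
        raise ValueError(f"unknown keywords: {unknown}")
    # Sets make the keyword order irrelevant; empty sets get the defaults.
    verbosity = {k for k in keywords if k in VERBOSITY} or {"optimized"}
    groups = {k for k in keywords if k in GROUPS} or {"optall"}
    return verbosity, groups, filename or None


def effective_opt_info(args):
    """Drop every option that names a file after the first one that does."""
    kept, seen_file = [], False
    for arg in args:
        parsed = parse_opt_info(arg)
        if parsed[2] is not None:
            if seen_file:
                continue  # a later file name conflicts and is ignored
            seen_file = True
        kept.append(parsed)
    return kept
```

With the hypothetical pair -fopt-info-vec-missed=vec.miss and -fopt-info-loop-optimized=loop.opt, only the first entry survives, which mirrors the conflict case described above.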
On targets that use instruction scheduling, this option controls the amount of debugging output the scheduler prints to the dump files.
For greater than zero, outputs the same information as and . For greater than one, it also outputs basic block probabilities, detailed ready list information and unit/insn info. For greater than two, it includes RTL at abort point, control-flow and regions info. For over four, it also includes dependence info.
This is a set of options that are used to explicitly disable/enable optimization passes. These options are intended for use for debugging GCC. Compiler users should use regular options for enabling/disabling passes instead.
Disable IPA pass . is the pass name. If the same pass is statically invoked in the compiler multiple times, the pass name should be appended with a sequential number starting from 1.
Disable RTL pass . is the pass name. If the same pass is statically invoked in the compiler multiple times, the pass name should be appended with a sequential number starting from 1. is a comma-separated list of function ranges or assembler names. Each range is a number pair separated by a colon. The range is inclusive in both ends. If the range is trivial, the number pair can be simplified as a single number. If the function’s call graph node’s falls within one of the specified ranges, the is disabled for that function. The is shown in the function header of a dump file, and the pass names can be dumped by using option .
Disable tree pass . See for the description of option arguments.
Enable IPA pass . is the pass name. If the same pass is statically invoked in the compiler multiple times, the pass name should be appended with a sequential number starting from 1.
Enable RTL pass . See for option argument description and examples.
Enable tree pass . See for the description of option arguments.
Here are some examples showing uses of these options.
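Independently of any particular command line, the matching rule for the range list described above can be sketched in Python. The helper names and the parameter name uid (standing for the number of the function's call graph node) are hypothetical; the sketch only models the stated rules: comma-separated items, colon-separated inclusive number pairs, a single number as a trivial range, and anything else treated as an assembler name.

```python
def parse_range_list(text):
    """Split a range list into numeric (low, high) ranges and assembler names."""
    ranges, names = [], set()
    for item in text.split(","):
        low, sep, high = item.partition(":")
        if low.isdigit() and (not sep or high.isdigit()):
            # A single number N is shorthand for the trivial range N:N.
            ranges.append((int(low), int(high) if sep else int(low)))
        else:
            names.add(item)
    return ranges, names


def pass_disabled_for(uid, asm_name, text):
    """True if the function is selected by name or by an inclusive range."""
    ranges, names = parse_range_list(text)
    return asm_name in names or any(low <= uid <= high for low, high in ranges)
```

For instance, with the list "1:10,20,foo" a function whose node number is 10 or 20 is selected, as is any function whose assembler name is foo.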
Enable internal consistency checking. The default depends on the compiler configuration. enables further internal consistency checking that might affect code generation.
This option provides a seed that GCC uses in place of random numbers in generating certain symbol names that have to be different in every compiled file. It is also used to place unique stamps in coverage data files and the object files that produce them. You can use the option to produce reproducibly identical object files.
The can either be a number (decimal, octal or hex) or an arbitrary string (in which case it’s converted to a number by computing CRC32).
The should be different for every file you compile.
Store the usual “temporary” intermediate files permanently; place them in the current directory and name them based on the source file. Thus, compiling with produces files and , as well as . This creates a preprocessed output file even though the compiler now normally uses an integrated preprocessor.
When used in combination with the command-line option, is sensible enough to avoid overwriting an input source file with the same extension as an intermediate file. The corresponding intermediate file may be obtained by renaming the source file before using .
If you invoke GCC in parallel, compiling several different source files that share a common base name in different subdirectories or the same source file compiled for multiple output destinations, it is likely that the different parallel compilers will interfere with each other, and overwrite the temporary files. For instance:
may result in and being written to simultaneously by both compilers.
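A build script can check for this hazard before launching parallel compilations. The following Python sketch groups source files by base name and reports the intermediate files that more than one compilation would write into the current directory. The function name colliding_temps is hypothetical, and the extensions .i and .s are an assumption that fits C sources; other languages use other suffixes.

```python
import os
from collections import defaultdict


def colliding_temps(sources, extensions=(".i", ".s")):
    """Map each contested intermediate file name to the sources that share it."""
    by_stem = defaultdict(list)
    for path in sources:
        # Intermediate files are named after the source file's base name.
        stem = os.path.splitext(os.path.basename(path))[0]
        by_stem[stem].append(path)
    return {
        stem + ext: paths
        for stem, paths in by_stem.items()
        if len(paths) > 1
        for ext in extensions
    }
```

An empty result means that no two listed sources share a base name, so their intermediate files cannot overwrite each other under these assumptions.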
Store the usual “temporary” intermediate files permanently. If the option is used, the temporary files are based on the object file. If the option is not used, the switch behaves like .
creates , , , , , , and .
Report the CPU time taken by each subprocess in the compilation sequence. For C source files, this is the compiler proper and assembler (plus the linker if linking is done).
Without the specification of an output file, the output looks like this:
The first number on each line is the “user time”, that is time spent executing the program itself. The second number is “system time”, time spent executing operating system routines on behalf of the program. Both numbers are in seconds.
With the specification of an output file, the output is appended to the named file, and it looks like this:
The “user time” and the “system time” are moved before the program name, and the options passed to the program are displayed, so that one can later tell what file was being compiled, and with which options.
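Both layouts can be read by one small parser. The Python sketch below assumes that a line without an output file looks like "# cc1 0.12 0.01" (a "#" marker, the program name, then user and system time) and that a line in the named file looks like "0.12 0.01 cc1 options"; these sample shapes are assumptions based on the description above, so verify them against real output before relying on the parser.

```python
def parse_time_line(line):
    """Parse one timing line in either assumed layout into a dictionary."""
    parts = line.split()
    if parts and parts[0] == "#":
        # Assumed layout without an output file: "# program user system"
        return {
            "program": parts[1],
            "user": float(parts[2]),
            "system": float(parts[3]),
            "options": [],
        }
    # Assumed layout with an output file: "user system program options..."
    return {
        "user": float(parts[0]),
        "system": float(parts[1]),
        "program": parts[2],
        "options": parts[3:],
    }
```

Summing the "user" and "system" values per program across a build log gives a rough picture of where compilation time goes.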
Dump the final internal representation (RTL) to . If the optional argument is omitted (or if is ), the name of the dump file is determined by appending to the compilation output file name.
If no error occurs during compilation, run the compiler a second time, adding and to the arguments passed to the second compilation. Dump the final internal representation in both compilations, and print an error if they differ.
If the equal sign is omitted, the default is used.
The environment variable , if defined, non-empty and nonzero, implicitly enables . If is defined to a string starting with a dash, then it is used for , otherwise the default is used.
, with the equal sign but without , is equivalent to , which disables the dumping of the final representation and the second compilation, preventing even from taking effect.
To verify full coverage during testing, set to say , which GCC rejects as an invalid option in any actual compilation (rather than preprocessing, assembly or linking). To get just a warning, setting to ‘’ will do.
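The way the environment variable selects the options for the second compilation can be summarized in a short Python sketch. It models only the rules stated above and is not the driver's actual logic: the function name is hypothetical, "nonzero" is interpreted as "not the string 0", and the default option string -gtoggle is an assumption.

```python
def compare_debug_opts(env_value, default="-gtoggle"):
    """Return the extra options for the second compilation, or None.

    env_value is the value of the environment variable, or None if unset.
    """
    # Undefined, empty, or zero: the comparison is not enabled implicitly.
    if env_value is None or env_value in ("", "0"):
        return None
    # A value starting with a dash is used as the option string itself.
    if env_value.startswith("-"):
        return env_value
    # Any other non-empty, nonzero value selects the default options.
    return default
```

Under these assumptions, a value of "1" enables the comparison with the default options, while a value such as "-gtoggle -fvar-tracking" is passed through unchanged.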
This option is implicitly passed to the compiler for the second compilation requested by , along with options to silence warnings, and omitting other options that would cause side-effect compiler outputs to files or to the standard output. Dump files and preserved temporary files are renamed so as to contain the additional extension during the second compilation, to avoid overwriting those generated by the first.
When this option is passed to the compiler driver, it causes the first compilation to be skipped, which makes it useful for little other than debugging the compiler proper.
Turn off generation of debug info, if leaving out this option generates it, or turn it on at level 2 otherwise. The position of this argument in the command line does not matter; it takes effect after all other options are processed, and it does so only once, no matter how many times it is given. This is mainly intended to be used with .
Toggle , in the same way that toggles .
Makes the compiler print out each function name as it is compiled, and print some statistics about each pass when it finishes.
Makes the compiler print some statistics about the time consumed by each pass when it finishes.
Record the time consumed by infrastructure parts separately for each pass.
Control the verbosity of the dump file for the integrated register allocator. The default value is 5. If the value is greater than or equal to 10, the dump output is sent to stderr using the same format as minus 10.
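The "minus 10" rule is easy to misread, so here is a minimal Python sketch of it. The function name and the destination labels are hypothetical; the sketch only restates the rule above.

```python
def ira_verbose_target(value):
    """Return (destination, effective_level) for a verbosity value."""
    if value >= 10:
        # Values from 10 upward go to stderr, formatted as (value - 10).
        return "stderr", value - 10
    return "dump file", value
```

A value of 15 therefore produces the same level of detail as the default of 5, only on stderr instead of in the dump file.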
Prints a report with internal details on the workings of the link-time optimizer. The contents of this report vary from version to version. It is meant to be useful to GCC developers when processing object files in LTO mode (via ).
Disabled by default.
Like , but only print for the WPA phase of Link Time Optimization.
Makes the compiler print some statistics about permanent memory allocation when it finishes.
Makes the compiler print some statistics about permanent memory allocation for the WPA phase only.
Makes the compiler print some statistics about permanent memory allocation before or after interprocedural optimization.
Makes the compiler print some statistics about consistency of the (estimated) profile and effect of individual passes.
Makes the compiler output stack usage information for the program, on a per-function basis. The filename for the dump is made by appending to the . is generated from the name of the output file, if explicitly specified and it is not an executable, otherwise it is the basename of the source file. An entry is made up of three fields:
- The name of the function.
- A number of bytes.
- One or more qualifiers: , , .
The qualifier means that the function manipulates the stack statically: a fixed number of bytes are allocated for the frame on function entry and released on function exit; no stack adjustments are otherwise made in the function. The second field is this fixed number of bytes.
The qualifier means that the function manipulates the stack dynamically: in addition to the static allocation described above, stack adjustments are made in the body of the function, for example to push/pop arguments around function calls. If the qualifier is also present, the amount of these adjustments is bounded at compile time and the second field is an upper bound of the total amount of stack used by the function. If it is not present, the amount of these adjustments is not bounded at compile time and the second field only represents the bounded part.
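The three-field entries can be processed with a short Python sketch. It assumes that the fields are separated by tab characters, that the qualifiers are spelled static, dynamic, and bounded, and that multiple qualifiers are joined by commas; the sample function labels in the usage note are hypothetical.

```python
from typing import NamedTuple


class StackUsage(NamedTuple):
    function: str
    size: int          # number of bytes (second field)
    qualifiers: tuple  # e.g. ("static",) or ("dynamic", "bounded")


def parse_su_line(line):
    """Parse one entry; assumes tab-separated fields."""
    name, size, qualifiers = line.rstrip("\n").split("\t")
    return StackUsage(name, int(size), tuple(qualifiers.split(",")))


def size_is_upper_bound(entry):
    """True when the byte count bounds the function's total stack use.

    Follows the description above: a static frame is fixed, and a dynamic
    frame is bounded only when the bounded qualifier is also present.
    """
    return "static" in entry.qualifiers or "bounded" in entry.qualifiers
```

For an entry such as "main.c:3:5:main", 48 bytes, static, the size can be used directly in a worst-case stack budget; for a dynamic entry without the bounded qualifier it covers only the bounded part.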
Emit statistics about front-end processing at the end of the compilation. This option is supported only by the C++ front end, and the information is generally only useful to the G++ development team.
Print the name and the counter upper bound for all debug counters.
Set the internal debug counter upper bound. is a comma-separated list of : pairs which sets the upper bound of each debug counter to . All debug counters have the initial upper bound of ; thus always returns true unless the upper bound is set by this option. For example, with , returns true only for the first 10 invocations.
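The counter behavior can be simulated in Python to make the semantics concrete. This is a model, not GCC code: the class and function names are hypothetical, the counter names dce and tail_call are used only as sample input, and the initial upper bound is assumed to be the largest 32-bit unsigned value.

```python
UINT_MAX = 2**32 - 1  # assumed initial upper bound of every counter


def parse_counter_list(text):
    """Parse 'name:value,name:value' into a dictionary of upper bounds."""
    limits = {}
    for pair in text.split(","):
        name, _, value = pair.partition(":")
        limits[name] = int(value)
    return limits


class DebugCounters:
    def __init__(self, limits=None):
        self.limits = dict(limits or {})
        self.counts = {}

    def check(self, name):
        """Count one invocation; True while the upper bound is not exceeded."""
        self.counts[name] = self.counts.get(name, 0) + 1
        return self.counts[name] <= self.limits.get(name, UINT_MAX)
```

With the list "dce:10,tail_call:0", the first ten checks of dce succeed and later ones fail, while tail_call fails from the first check. Lowering one bound at a time in this way is what makes such counters useful for bisecting a miscompilation to a single transformation.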
Print the full absolute name of the library file that would be used when linking—and don’t do anything else. With this option, GCC does not compile or link anything; it just prints the file name.
Print the directory name corresponding to the multilib selected by any other switches present in the command line. This directory is supposed to exist in .
Print the mapping from multilib directory names to compiler switches that enable them. The directory name is separated from the switches by ‘’, and each switch starts with an ‘’ instead of the ‘’, without spaces between multiple switches. This is supposed to ease shell processing.
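The output format is designed for easy machine processing, as the following Python sketch shows. It assumes that the separator between the directory and the switches is ";" and that each switch is introduced by "@" in place of the leading "-"; the function name and the sample lines are hypothetical.

```python
def parse_multilib_line(line):
    """Return (directory, switches) for one line of the mapping."""
    directory, _, switches = line.strip().partition(";")
    # Each switch is written with '@' instead of '-', without spaces.
    return directory, ["-" + s for s in switches.split("@") if s]
```

For example, a line such as "32;@m32" maps the directory 32 to the single switch -m32, and a line such as ".;" maps the default directory to no switches.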
Print the path to OS libraries for the selected multilib, relative to some subdirectory. If OS libraries are present in the subdirectory and no multilibs are used, this is usually just , if OS libraries are present in sibling directories this prints e.g. , or , or if OS libraries are present in subdirectories it prints e.g. , or .
Print the path to OS libraries for the selected multiarch, relative to some subdirectory.
Like , but searches for a program such as .
Same as .
This is useful when you use or but you do want to link with . You can do:
Print the name of the configured installation directory and a list of the program and library directories that are searched, and don’t do anything else.
This is useful when prints the error message ‘’. To resolve this you either need to put and the other compiler components where expects to find them, or you can set the environment variable to the directory where you installed them. Don’t forget the trailing ‘’. See Environment Variables.
Print the target sysroot directory that is used during compilation. This is the target sysroot specified either at configure time or using the option, possibly with an extra suffix that depends on compilation options. If no target sysroot is specified, the option prints nothing.
Print the suffix added to the target sysroot when searching for headers, or give an error if the compiler is not configured with such a suffix—and don’t do anything else.
Print the compiler’s target machine (for example, ‘’)—and don’t do anything else.
Print the compiler version (for example, , or ), and don’t do anything else. This is the compiler version used in filesystem paths and specs. Depending on how the compiler has been configured, it can be just a single number (major version), two numbers separated by a dot (major and minor version), or three numbers separated by dots (major, minor and patchlevel version).
Print the full compiler version, which is always three numbers separated by dots: major, minor and patchlevel version.
Print the compiler’s built-in specs—and don’t do anything else. (This is used when GCC itself is being built.) See Spec Files.