FAST_FUNC not working well with GCC's LTO

Denys Vlasenko vda.linux at googlemail.com
Mon Jan 9 17:54:17 UTC 2017


Applied, thanks.

On Thu, Jan 5, 2017 at 2:25 AM, Kang-Che Sung <explorer09 at gmail.com> wrote:
> (This mail and patch were sent to the busybox mailing list on Dec 25, 2016,
> and I'm re-sending them so that people notice.)
>
> Busybox uses the FAST_FUNC macro to tweak the IA-32 calling convention in
> order to make function calls slightly smaller or slightly faster.
> However, when I experimented with GCC's LTO (Link Time Optimization), I
> discovered that FAST_FUNC can hinder LTO's optimization, so that the
> resulting executable becomes a few bytes larger (than one compiled
> without FAST_FUNC).
>
> Although I can comment out the FAST_FUNC lines in include/platform.h to
> achieve the level of optimization I want, may I suggest a way for users
> to disable FAST_FUNC conveniently?
>
> For example, let me specify CONFIG_EXTRA_CFLAGS="-DFAST_FUNC= -flto"
> so that I can compile with LTO without a source code hack. It seems that
> GCC does not yet provide a macro or another way to detect LTO in code, so
> this is the best suggestion I have.
>
> The changes would be something like the ones below. I would appreciate
> comments on this problem and my suggestion.
>
> Kang-Che Sung ("Explorer")
>
> --------
>
> diff --git a/include/platform.h b/include/platform.h
> index c987d418c..7e537b950 100644
> --- a/include/platform.h
> +++ b/include/platform.h
> @@ -108,13 +108,19 @@
>   * and/or smaller by using modified ABI. It is usually only needed
>   * on non-static, busybox internal functions. Recent versions of gcc
>   * optimize statics automatically. FAST_FUNC on static is required
> - * only if you need to match a function pointer's type */
> -#if __GNUC_PREREQ(3,0) && defined(i386) /* || defined(__x86_64__)? */
> + * only if you need to match a function pointer's type.
> + * FAST_FUNC may not work well with -flto so allow user to disable this.
> + * (-DFAST_FUNC= ) */
> +#ifndef FAST_FUNC
> +# if __GNUC_PREREQ(3,0) && defined(i386)
>  /* stdcall makes callee to pop arguments from stack, not caller */
> -# define FAST_FUNC __attribute__((regparm(3),stdcall))
> +#  define FAST_FUNC __attribute__((regparm(3),stdcall))
>  /* #elif ... - add your favorite arch today! */
> -#else
> -# define FAST_FUNC
> +/* x86_64 doesn't need this - its ABI can't be tweaked like IA-32 (can't use
> + * stdcall; the ABI uses 6 regparms already). */
> +# else
> +#  define FAST_FUNC
> +# endif
>  #endif
>
>  /* Make all declarations hidden (-fvisibility flag only affects definitions) */
> _______________________________________________
> busybox mailing list
> busybox at busybox.net
> http://lists.busybox.net/mailman/listinfo/busybox
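
For readers who want to see the override in isolation, here is a minimal
sketch of the guard pattern from the patch above. It is not BusyBox code:
add3(), fast_func_is_empty() and fast_func_self_check() are hypothetical
names invented for illustration, and the sketch tests __GNUC__ and __i386__
directly instead of using BusyBox's __GNUC_PREREQ and i386, so that it
compiles on its own. Building it with -DFAST_FUNC= should leave the macro
empty on every target, which is the effect the patch is meant to allow.

```c
#include <assert.h>

/* Same shape as the patched include/platform.h: a definition supplied on
 * the command line (for example -DFAST_FUNC=) takes precedence over the
 * built-in default. The __GNUC__/__i386__ test is an assumption made so
 * this file stands alone; BusyBox itself uses __GNUC_PREREQ and i386. */
#ifndef FAST_FUNC
# if defined(__GNUC__) && defined(__i386__)
#  define FAST_FUNC __attribute__((regparm(3),stdcall))
# else
#  define FAST_FUNC
# endif
#endif

/* Two-level stringification so FAST_FUNC is expanded before being quoted. */
#define FF_STR(x)  #x
#define FF_XSTR(x) FF_STR(x)

/* Hypothetical helper, not a BusyBox function. The attribute is placed on
 * the declaration so that every caller uses the same calling convention. */
int FAST_FUNC add3(int a, int b, int c);

int FAST_FUNC add3(int a, int b, int c)
{
    return a + b + c;
}

/* Returns 1 when FAST_FUNC expands to nothing (non-IA-32 targets, or any
 * target when built with -DFAST_FUNC=), and 0 otherwise. */
int fast_func_is_empty(void)
{
    return sizeof(FF_XSTR(FAST_FUNC)) == 1;
}

/* The result of a call must not depend on whether the attribute is active. */
void fast_func_self_check(void)
{
    assert(add3(1, 2, 3) == 6);
}
```

Compiling this once with "gcc -m32 -c" and once with "gcc -m32 -DFAST_FUNC= -c"
and comparing the two objects is one way to observe the difference in the
generated calls; no particular size difference is claimed here.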

