dfawcus 2 days ago

On point [3], using Shane's code with one additional benchmark for gccgo, on my laptop I see:

    $ go-11 test -cpu=1,2,4,8,16 -bench Cgo
    goos: linux
    goarch: amd64
    pkg: github.com/shanemhansen/cgobench
    cpu: 13th Gen Intel(R) Core(TM) i5-1340P
    BenchmarkCgoCall         6123471        195.8 ns/op
    BenchmarkCgoCall-2      11794101         97.74 ns/op
    BenchmarkCgoCall-4      22250806         51.30 ns/op
    BenchmarkCgoCall-8      33147904         34.16 ns/op
    BenchmarkCgoCall-16     53388628         22.41 ns/op
    PASS
    ok   github.com/shanemhansen/cgobench 6.364s

    $ go-11 test -cpu=1,2,4,8,16 -bench Gcc
    goos: linux
    goarch: amd64
    pkg: github.com/shanemhansen/cgobench
    cpu: 13th Gen Intel(R) Core(TM) i5-1340P
    BenchmarkGccCall        414216266          3.037 ns/op
    BenchmarkGccCall-2      788898944          1.523 ns/op
    BenchmarkGccCall-4      1000000000          0.7670 ns/op
    BenchmarkGccCall-8      1000000000          0.4909 ns/op
    BenchmarkGccCall-16     1000000000          0.3488 ns/op
    PASS
    ok   github.com/shanemhansen/cgobench 4.806s

    $ go-11 test -cpu=1,2,4,8,16 -bench EmptyCall
    goos: linux
    goarch: amd64
    pkg: github.com/shanemhansen/cgobench
    cpu: 13th Gen Intel(R) Core(TM) i5-1340P
    BenchmarkEmptyCallInlineable        1000000000          0.5483 ns/op
    BenchmarkEmptyCallInlineable-2      1000000000          0.2752 ns/op
    BenchmarkEmptyCallInlineable-4      1000000000          0.1463 ns/op
    BenchmarkEmptyCallInlineable-8      1000000000          0.1295 ns/op
    BenchmarkEmptyCallInlineable-16     1000000000          0.1225 ns/op
    BenchmarkEmptyCall                  499314484          2.401 ns/op
    BenchmarkEmptyCall-2                977968472          1.202 ns/op
    BenchmarkEmptyCall-4                1000000000          0.6316 ns/op
    BenchmarkEmptyCall-8                1000000000          0.4111 ns/op
    BenchmarkEmptyCall-16               1000000000          0.2765 ns/op
    PASS
    ok   github.com/shanemhansen/cgobench 5.707s

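Hence the gccgo version of calling the C function (BenchmarkGccCall) is in the same ballpark as a native Go function call (BenchmarkEmptyCall), and far cheaper than the Cgo call. This is to be expected when using that mechanism, since the C function is linked and called directly.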

So using various C libraries does not necessarily have to involve the overhead from Cgo.

    diff --git a/bench.go b/bench.go
    index 8852c75..7bfd870 100644
    --- a/bench.go
    +++ b/bench.go
    @@ -15,3 +15,10 @@ func Call() {
     func CgoCall() {
            C.trivial_add(1,2)
     }
    +
    +//go:linkname c_trivial_add trivial_add
    +func c_trivial_add(a int, b int) int
    +
    +func GccCall() {
    +       c_trivial_add(1,2)
    +}
    diff --git a/bench_test.go b/bench_test.go
    index 9523668..c390c63 100644
    --- a/bench_test.go
    +++ b/bench_test.go
    @@ -43,3 +43,6 @@ func BenchmarkEmptyCall(b *testing.B) {
     func BenchmarkCgoCall(b *testing.B) {
            pbench(b, CgoCall)
     }
    +func BenchmarkGccCall(b *testing.B) {
    +       pbench(b, GccCall)
    +}