What causes Go's 4x performance loss on this array access microbenchmark (relative to GCC)?

Feb 10, 2024 am 08:51 AM

In this array-access microbenchmark, Go runs about 4x slower than the same code compiled with GCC. Common suspects for a gap like this are Go's bounds checks on slice accesses, a compiler that optimizes less aggressively than GCC, and the garbage collector. This benchmark allocates nothing inside the loop and its accesses are plainly in bounds, so, as the answer below explains, the gap comes mainly from compiler optimization: GCC automatically vectorizes the loop, while the Go compiler does not appear to.

Question content

I wrote this microbenchmark to better understand Go's performance characteristics, so that I can make informed choices about when to use it.

From a performance-overhead perspective, I thought this would be the ideal scenario for Go:

  • No allocation/release inside the loop
  • Array access is obviously within bounds (bounds check can be removed)

Nonetheless, I see a 4x speed difference relative to gcc -O3 on amd64. Why is that?

(Timed from the shell. Each run takes a few seconds, so startup time can be ignored.)

package main

import "fmt"

func main() {
    fmt.Println("started")

    var n int32 = 1024 * 32

    // Allocate both slices once, outside the timed loops.
    a := make([]int32, n, n)
    b := make([]int32, n, n)

    var it, i, j int32

    for i = 0; i < n; i++ {
        a[i] = i
        b[i] = -i
    }

    var r int32 = 10
    var sum int32 = 0

    // Hot loop: r * n * n iterations, every index is within bounds.
    for it = 0; it < r; it++ {
        for i = 0; i < n; i++ {
            for j = 0; j < n; j++ {
                sum += (a[i] + b[j]) * (it + 1)
            }
        }
    }
    fmt.Printf("n = %d, r = %d, sum = %d\n", n, r, sum)
}

C version:

#include <stdint.h>  /* int32_t */
#include <stdio.h>
#include <stdlib.h>

int main() {
    printf("started\n");

    int32_t n = 1024 * 32;

    /* Allocate both arrays once, outside the timed loops. */
    int32_t* a = malloc(sizeof(int32_t) * n);
    int32_t* b = malloc(sizeof(int32_t) * n);

    for(int32_t i = 0; i < n; ++i) {
        a[i] =  i;
        b[i] = -i;
    }

    int32_t r = 10;
    int32_t sum = 0;

    /* Hot loop: r * n * n iterations. */
    for(int32_t it = 0; it < r; ++it) {
        for(int32_t i = 0; i < n; ++i) {
            for(int32_t j = 0; j < n; ++j) {
                sum += (a[i] + b[j]) * (it + 1);
            }
        }
    }
    printf("n = %d, r = %d, sum = %d\n", n, r, sum);

    free(a);
    free(b);
}

Update:

  • Using range, as recommended, made the Go version 2x faster.
  • On the other hand, -march=native made the C version 2x faster in my tests. (And -mno-sse gives a compilation error, apparently because it is incompatible with -O3.)
  • gccgo looks equivalent to gcc here (and doesn't require range).
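
For illustration, here is a minimal sketch of what the range rewrite mentioned in the first bullet might look like. The function names sumIndexed and sumRange are hypothetical, and the slice length is reduced from the benchmark's 32768 so the program finishes instantly. The sketch only shows that the two forms compute the same value; it does not measure the speedup, and the reason range helps (plausibly fewer bounds checks and no int32 index conversions) is an assumption that was not verified here.

```go
package main

import "fmt"

// sumIndexed mirrors the original benchmark: int32 loop counters
// used as slice indices.
func sumIndexed(a, b []int32, r int32) int32 {
	n, m := int32(len(a)), int32(len(b))
	var sum int32
	for it := int32(0); it < r; it++ {
		for i := int32(0); i < n; i++ {
			for j := int32(0); j < m; j++ {
				sum += (a[i] + b[j]) * (it + 1)
			}
		}
	}
	return sum
}

// sumRange computes the same value, but lets range hand out the
// elements directly instead of indexing the slices.
func sumRange(a, b []int32, r int32) int32 {
	var sum int32
	for it := int32(0); it < r; it++ {
		for _, x := range a {
			for _, y := range b {
				sum += (x + y) * (it + 1)
			}
		}
	}
	return sum
}

func main() {
	const n = 1024 // hypothetical size, smaller than the benchmark's
	a := make([]int32, n)
	b := make([]int32, n)
	for i := range a {
		a[i] = int32(i)
		b[i] = -int32(i)
	}
	// Both variants add the same terms in the same order.
	fmt.Println(sumIndexed(a, b, 10) == sumRange(a, b, 10))
}
```

To compare timings, each function could be placed in the hot loop of the original program and timed from the shell as before.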

Solution

Look at the assembler output of the C program and the Go program. At least with the versions of Go and GCC I am using (1.19.6 and 12.2.0 respectively), the most direct and obvious difference is that GCC automatically vectorizes the C program, while the Go compiler seems unable to do so.

This also nicely explains why you see exactly a 4x performance difference: when not targeting a specific architecture, GCC uses SSE rather than AVX, and a 128-bit SSE register holds four 32-bit values, so each vector instruction does four times the work of a 32-bit scalar instruction. In fact, adding -march=native gave me a 2x performance improvement, because it made GCC emit AVX code on my CPU.
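
The lane arithmetic can be made concrete with a small sketch. This is only a scalar model of what a vectorized loop does, not SIMD code, and it will not make the Go compiler emit vector instructions. The names sumScalar and sumFourLanes are hypothetical. The four partial sums stand in for the four 32-bit lanes of a 128-bit register; a 256-bit register would hold eight.

```go
package main

import "fmt"

// sumScalar handles one 32-bit element per step, the way
// non-vectorized code does.
func sumScalar(x int32, b []int32) int32 {
	var sum int32
	for _, y := range b {
		sum += x + y
	}
	return sum
}

// sumFourLanes keeps four independent partial sums, modelling the four
// 32-bit lanes of a 128-bit register. A real vector instruction would
// update all four lanes at once; here they are updated one by one.
func sumFourLanes(x int32, b []int32) int32 {
	var lanes [4]int32
	i := 0
	for ; i+4 <= len(b); i += 4 {
		lanes[0] += x + b[i]
		lanes[1] += x + b[i+1]
		lanes[2] += x + b[i+2]
		lanes[3] += x + b[i+3]
	}
	// Horizontal add: combine the lanes into one result.
	sum := lanes[0] + lanes[1] + lanes[2] + lanes[3]
	// Scalar tail for lengths that are not a multiple of four.
	for ; i < len(b); i++ {
		sum += x + b[i]
	}
	return sum
}

func main() {
	b := []int32{1, 2, 3, 4, 5, 6, 7}
	fmt.Println("32-bit lanes per 128-bit register:", 128/32)
	fmt.Println("32-bit lanes per 256-bit register:", 256/32)
	fmt.Println(sumScalar(10, b) == sumFourLanes(10, b))
}
```

Reordering the additions this way is safe for Go's int32 because its overflow wraps, so the lane-wise sum always equals the sequential one.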

I'm not familiar enough with Go to tell you whether the Go compiler is intrinsically unable to autovectorize, or whether something about this particular program causes it to miss the optimization, but that seems to be the root cause.
