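When CPU profiling is enabled, the runtime interrupts itself every 10 ms and records the stack traces of the goroutines that are running at that moment.
After the program finishes, the recorded data can be analyzed to find the hottest code paths.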
Compiler hot paths are code execution paths in the compiler in which most of the execution time is spent, and which are potentially executed very often. -- What's the meaning of “hot codepath”
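To make the 10 ms sampling interval concrete, here is a minimal sketch that converts on-CPU time into an approximate number of samples. The names `sampleInterval`, `exampleCPUTime`, and `expectedSamples` are illustrative and not part of any Go API; the 14.26 s figure is the "Total samples" value from the pprof header shown later in this article, and the estimate assumes each sample stands for one 10 ms interval.

```go
package main

import (
	"fmt"
	"time"
)

// sampleInterval is the CPU profiler's sampling period described above.
const sampleInterval = 10 * time.Millisecond

// exampleCPUTime is the sampled CPU time reported in this article's pprof header.
const exampleCPUTime = 14260 * time.Millisecond

// expectedSamples estimates how many stack samples correspond to the given
// amount of sampled CPU time, assuming one sample per sampleInterval.
func expectedSamples(cpuTime time.Duration) int {
	return int(cpuTime / sampleInterval)
}

func main() {
	fmt.Printf("%v of sampled CPU time is about %d samples\n",
		exampleCPUTime, expectedSamples(exampleCPUTime))
}
```

Because the profiler only sees what is running when the interrupt fires, very short programs may yield too few samples to be meaningful.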
// main.go
package main

import (
	"math/rand"
	"time"
)

// generate returns a slice of n random integers.
func generate(n int) []int {
	rand.Seed(time.Now().UnixNano())
	nums := make([]int, 0)
	for i := 0; i < n; i++ {
		nums = append(nums, rand.Int())
	}
	return nums
}

// bubbleSort sorts nums in place in ascending order.
func bubbleSort(nums []int) {
	for i := 0; i < len(nums); i++ {
		for j := 1; j < len(nums)-i; j++ {
			if nums[j] < nums[j-1] {
				nums[j], nums[j-1] = nums[j-1], nums[j]
			}
		}
	}
}

func main() {
	// Sort inputs of 10, 100, 1000, 10000, and 100000 elements.
	n := 10
	for i := 0; i < 5; i++ {
		nums := generate(n)
		bubbleSort(nums)
		n *= 10
	}
}
import (
	"math/rand"
	"os"
	"runtime/pprof"
	"time"
)

func main() {
	pprof.StartCPUProfile(os.Stdout)
	defer pprof.StopCPUProfile()
	n := 10
	for i := 0; i < 5; i++ {
		nums := generate(n)
		bubbleSort(nums)
		n *= 10
	}
}
$ go run main.go > cpu.pprof
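func main() {
	// Write the profile to a file instead of standard output.
	// O_TRUNC keeps an older, longer profile from leaving trailing bytes behind.
	f, err := os.OpenFile("cpu.pprof", os.O_CREATE|os.O_RDWR|os.O_TRUNC, 0644)
	if err != nil {
		panic(err)
	}
	defer f.Close()
	if err := pprof.StartCPUProfile(f); err != nil {
		panic(err)
	}
	defer pprof.StopCPUProfile()
	n := 10
	for i := 0; i < 5; i++ {
		nums := generate(n)
		bubbleSort(nums)
		n *= 10
	}
}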
$ go tool pprof -http=:9999 cpu.pprof
$ go tool pprof cpu.pprof
File: main
Type: cpu
Time: Nov 19, 2020 at 1:43am (CST)
Duration: 16.42s, Total samples = 14.26s (86.83%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 14.14s, 99.16% of 14.26s total
Dropped 34 nodes (cum <= 0.07s)
      flat  flat%   sum%        cum   cum%
    14.14s 99.16% 99.16%     14.17s 99.37%  main.bubbleSort
         0     0% 99.16%     14.17s 99.37%  main.main
         0     0% 99.16%     14.17s 99.37%  runtime.main
(pprof) top --cum
Showing nodes accounting for 14.14s, 99.16% of 14.26s total
Dropped 34 nodes (cum <= 0.07s)
      flat  flat%   sum%        cum   cum%
    14.14s 99.16% 99.16%     14.17s 99.37%  main.bubbleSort
         0     0% 99.16%     14.17s 99.37%  main.main
         0     0% 99.16%     14.17s 99.37%  runtime.main
(pprof) help
Commands:
callgrind Outputs a graph in callgrind format
comments Output all profile comments
disasm Output assembly listings annotated with samples
dot Outputs a graph in DOT format
eog Visualize graph through eog
evince Visualize graph through evince
gif Outputs a graph image in GIF format
gv Visualize graph through gv
......
package main

import (
	"github.com/pkg/profile"
	"math/rand"
)

const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

func randomString(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letterBytes[rand.Intn(len(letterBytes))]
	}
	return string(b)
}

func concat(n int) string {
	s := ""
	for i := 0; i < n; i++ {
		s += randomString(n)
	}
	return s
}

func main() {
	defer profile.Start(profile.MemProfile, profile.MemProfileRate(1)).Stop()
	concat(100)
}
$ go run main.go
2020/11/22 18:38:29 profile: cpu profiling enabled, /tmp/profile068616584/cpu.pprof
2020/11/22 18:39:12 profile: cpu profiling disabled, /tmp/profile068616584/cpu.pprof
$ go run main.go
2020/11/22 18:59:04 profile: memory profiling enabled (rate 1), /tmp/profile215959616/mem.pprof
2020/11/22 18:59:04 profile: memory profiling disabled, /tmp/profile215959616/mem.pprof
$ go tool pprof -http=:9999 /tmp/profile215959616/mem.pprof
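For readers who would rather not add a third-party dependency, a similar heap profile can be collected with the standard library alone. The sketch below is an assumption-laden equivalent, not what `pkg/profile` does internally: the helper name `profileToFile` and the output path `mem.pprof` are made up for illustration. It sets `runtime.MemProfileRate` to 1 to mirror `profile.MemProfileRate(1)` and writes the profile with `pprof.WriteHeapProfile`.

```go
package main

import (
	"fmt"
	"math/rand"
	"os"
	"runtime"
	"runtime/pprof"
)

const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

func randomString(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letterBytes[rand.Intn(len(letterBytes))]
	}
	return string(b)
}

func concat(n int) string {
	s := ""
	for i := 0; i < n; i++ {
		s += randomString(n)
	}
	return s
}

// profileToFile (hypothetical helper) runs work and then writes a heap
// profile to path.
func profileToFile(path string, work func()) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	work()
	runtime.GC() // bring the allocation statistics up to date before writing
	return pprof.WriteHeapProfile(f)
}

func main() {
	// Sample every allocation; set this as early as possible in the program.
	runtime.MemProfileRate = 1

	if err := profileToFile("mem.pprof", func() { concat(100) }); err != nil {
		fmt.Println("heap profile failed:", err)
		return
	}
	fmt.Println("heap profile written to mem.pprof")
}
```

The resulting file can be opened the same way as above, for example with `go tool pprof -http=:9999 mem.pprof`.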
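// concat builds the result with strings.Builder instead of repeated +=.
// It requires "strings" in the import list.
func concat(n int) string {
	var s strings.Builder
	for i := 0; i < n; i++ {
		s.WriteString(randomString(n))
	}
	return s.String()
}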
$ go run main.go
2020/11/22 19:17:55 profile: memory profiling enabled (rate 1), /tmp/profile061547314/mem.pprof
2020/11/22 19:17:55 profile: memory profiling disabled, /tmp/profile061547314/mem.pprof
$ go tool pprof /tmp/profile061547314/mem.pprof
File: main
Type: inuse_space
Time: Nov 22, 2020 at 7:17pm (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top --cum
Showing nodes accounting for 71.22kB, 89.01% of 80.01kB total
Dropped 25 nodes (cum <= 0.40kB)
Showing top 10 nodes out of 50
      flat  flat%   sum%        cum   cum%
         0     0%     0%    71.88kB 89.84%  main.main
         0     0%     0%    71.88kB 89.84%  runtime.main
         0     0%     0%    67.64kB 84.54%  main.concat
   45.77kB 57.20% 57.20%    45.77kB 57.20%  strings.(*Builder).WriteString
   21.88kB 27.34% 84.54%    21.88kB 27.34%  main.randomString
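The difference between the two `concat` versions can also be checked without a profile. The following sketch counts heap allocations per call with `testing.AllocsPerRun`; the names `concatPlus`, `concatBuilder`, and `allocsPerCall` are introduced here only to keep both variants in one file, and the exact counts will vary with the Go version.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
	"testing"
)

const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

func randomString(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letterBytes[rand.Intn(len(letterBytes))]
	}
	return string(b)
}

// concatPlus is the += version: each iteration copies the string built so far.
func concatPlus(n int) string {
	s := ""
	for i := 0; i < n; i++ {
		s += randomString(n)
	}
	return s
}

// concatBuilder is the strings.Builder version: the buffer grows occasionally
// instead of being reallocated on every iteration.
func concatBuilder(n int) string {
	var s strings.Builder
	for i := 0; i < n; i++ {
		s.WriteString(randomString(n))
	}
	return s.String()
}

// allocsPerCall reports the average number of heap allocations made by f(n).
func allocsPerCall(f func(int) string, n int) float64 {
	return testing.AllocsPerRun(10, func() { f(n) })
}

func main() {
	fmt.Printf("+=      : %.0f allocs per call\n", allocsPerCall(concatPlus, 100))
	fmt.Printf("Builder : %.0f allocs per call\n", allocsPerCall(concatBuilder, 100))
}
```

Allocation counts complement the profile: the profile shows where memory is allocated, while the count gives a quick before-and-after number for a single function.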
func fib(n int) int {
	if n == 0 || n == 1 {
		return n
	}
	return fib(n-2) + fib(n-1)
}

func BenchmarkFib(b *testing.B) {
	for n := 0; n < b.N; n++ {
		fib(30) // run fib(30) b.N times
	}
}
$ go test -bench="Fib$" -cpuprofile=cpu.pprof .
goos: linux
goarch: amd64
pkg: example
BenchmarkFib-8 196 6071636 ns/op
PASS
ok example 2.046s
$ go tool pprof -text cpu.pprof
File: example.test
Type: cpu
Time: Nov 22, 2020 at 7:52pm (CST)
Duration: 2.01s, Total samples = 1.77s (87.96%)
Showing nodes accounting for 1.77s, 100% of 1.77s total
      flat  flat%   sum%        cum   cum%
     1.76s 99.44% 99.44%      1.76s 99.44%  example.fib
     0.01s  0.56%   100%      0.01s  0.56%  runtime.futex
         0     0%   100%      1.76s 99.44%  example.BenchmarkFib
         0     0%   100%      0.01s  0.56%  runtime.findrunnable
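The same benchmark can be run from an ordinary program, without `go test`, by passing the function to `testing.Benchmark`. This is only a sketch for quick experiments: the lowercase name `benchmarkFib` is chosen here for illustration, and this approach does not produce the `cpu.pprof` file that the `-cpuprofile` flag writes.

```go
package main

import (
	"fmt"
	"testing"
)

func fib(n int) int {
	if n == 0 || n == 1 {
		return n
	}
	return fib(n-2) + fib(n-1)
}

// benchmarkFib has the same body as BenchmarkFib above.
func benchmarkFib(b *testing.B) {
	for n := 0; n < b.N; n++ {
		fib(30) // run fib(30) b.N times
	}
}

func main() {
	// testing.Benchmark picks b.N automatically, as go test does.
	result := testing.Benchmark(benchmarkFib)
	fmt.Printf("%d iterations, %d ns/op\n", result.N, result.NsPerOp())
}
```

The reported ns/op depends on the machine, so it will not match the figures shown in the output above.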
$ go tool pprof
Details:
Output formats (select at most one):
-dot Outputs a graph in DOT format
-png Outputs a graph image in PNG format
-text Outputs top entries in text form
-tree Outputs a text rendering of call graph
-web Visualize graph through web browser
...