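When CPU profiling is enabled, the runtime interrupts the program every 10 ms and records the stack traces of the goroutines that are running at that moment.
Once the program has finished, the recorded data can be analyzed to find the hottest code paths.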
Compiler hot paths are code execution paths in the compiler in which most of the execution time is spent, and which are potentially executed very often. -- What's the meaning of “hot codepath”
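Go's runtime profiling APIs all live in the runtime/pprof package. Calling into runtime/pprof is all it takes to collect the data we want.
Suppose we have written the following program, which generates 5 sets of random data and sorts each one with bubble sort.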
// main.go
package main

import (
	"math/rand"
	"time"
)

func generate(n int) []int {
	rand.Seed(time.Now().UnixNano())
	nums := make([]int, 0)
	for i := 0; i < n; i++ {
		nums = append(nums, rand.Int())
	}
	return nums
}

func bubbleSort(nums []int) {
	for i := 0; i < len(nums); i++ {
		for j := 1; j < len(nums)-i; j++ {
			if nums[j] < nums[j-1] {
				nums[j], nums[j-1] = nums[j-1], nums[j]
			}
		}
	}
}

func main() {
	n := 10
	for i := 0; i < 5; i++ {
		nums := generate(n)
		bubbleSort(nums)
		n *= 10
	}
}
To collect CPU profiling data for this program, we only need to add 2 lines to the main function:
import (
	"math/rand"
	"os"
	"runtime/pprof"
	"time"
)

func main() {
	pprof.StartCPUProfile(os.Stdout)
	defer pprof.StopCPUProfile()
	n := 10
	for i := 0; i < 5; i++ {
		nums := generate(n)
		bubbleSort(nums)
		n *= 10
	}
}
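Because the profile is written to standard output, redirect it to a file when running the program (for example, go run main.go > cpu.pprof), then open that file with go tool pprof:
$ go tool pprof cpu.pprof
File: main
Type: cpu
Time: Nov 19, 2020 at 1:43am (CST)
Duration: 16.42s, Total samples = 14.26s (86.83%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 14.14s, 99.16% of 14.26s total
Dropped 34 nodes (cum <= 0.07s)
      flat  flat%   sum%        cum   cum%
    14.14s 99.16% 99.16%     14.17s 99.37%  main.bubbleSort
         0     0% 99.16%     14.17s 99.37%  main.main
         0     0% 99.16%     14.17s 99.37%  runtime.main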
As the output shows, main.bubbleSort is the function that consumes the most CPU.
The results can also be sorted by cum (cumulative cost):
(pprof) top --cum
Showing nodes accounting for 14.14s, 99.16% of 14.26s total
Dropped 34 nodes (cum <= 0.07s)
      flat  flat%   sum%        cum   cum%
    14.14s 99.16% 99.16%     14.17s 99.37%  main.bubbleSort
         0     0% 99.16%     14.17s 99.37%  main.main
         0     0% 99.16%     14.17s 99.37%  runtime.main
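To make the difference between the flat and cum columns concrete, here is a minimal sketch. flat is the time sampled inside a function's own body, while cum also includes the time spent in the functions it calls. The functions leaf and parent are hypothetical and exist only for illustration. If this program were profiled, one would expect leaf to carry nearly all of the flat time, while parent shows almost no flat time but a cum value close to leaf's, much like main.main and main.bubbleSort in the output above.

```go
package main

import "fmt"

// leaf does the actual work, so samples taken here count toward its flat time.
// The noinline directive keeps it visible as a separate frame in a profile.
//
//go:noinline
func leaf(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i
	}
	return sum
}

// parent does almost nothing itself (low flat), but all the time spent in
// leaf is attributed to parent's cumulative (cum) time.
//
//go:noinline
func parent(n int) int {
	return leaf(n) + leaf(n)
}

func main() {
	fmt.Println(parent(200_000_000))
}
```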
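// Memory profiling with the third-party package github.com/pkg/profile.
package main

import (
	"math/rand"

	"github.com/pkg/profile"
)

const letterBytes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// randomString returns a random string of n ASCII letters.
func randomString(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letterBytes[rand.Intn(len(letterBytes))]
	}
	return string(b)
}

// concat builds a string with +=, allocating a new string on every iteration.
func concat(n int) string {
	s := ""
	for i := 0; i < n; i++ {
		s += randomString(n)
	}
	return s
}

func main() {
	// MemProfileRate(1) records every allocation instead of sampling.
	defer profile.Start(profile.MemProfile, profile.MemProfileRate(1)).Stop()
	concat(100)
}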
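// Place this in a file whose name ends in _test.go; the package name main is an assumption.
package main

import "testing"

func fib(n int) int {
	if n == 0 || n == 1 {
		return n
	}
	return fib(n-2) + fib(n-1)
}

func BenchmarkFib(b *testing.B) {
	for n := 0; n < b.N; n++ {
		fib(30) // run fib(30) b.N times
	}
}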
Passing the -cpuprofile flag to go test is enough to generate a CPU profile for BenchmarkFib, for example:
$ go test -bench=. -cpuprofile=cpu.pprof
pprof supports several output formats (image, text, web, and so on). Run go tool pprof on the command line with no arguments to see all supported options:
$ go tool pprof
Details:
Output formats (select at most one):
-dot Outputs a graph in DOT format
-png Outputs a graph image in PNG format
-text Outputs top entries in text form
-tree Outputs a text rendering of call graph
-web Visualize graph through web browser
...