寶山做網(wǎng)站公司南陽網(wǎng)站seo
如何使用gprof對程序進行性能分析
目錄
1 gprof概述
2 gprof原理簡述
3 gprof使用
??3.1 gprof使用簡述
??3.2 gprof使用示例
4 小結(jié)
1 gprof概述
????gprof 是 一個 GNU 的程序性能分析工具,可以用于分析C\C++程序的執(zhí)行性能。gprof工具可以統(tǒng)計出各個函數(shù)的調(diào)用次數(shù)、執(zhí)行時間、函數(shù)調(diào)用關(guān)系,具體功能可以通過 man gprof進一步了解。通常該工具在大多數(shù)linux內(nèi)核的發(fā)行版本中會在你安裝C/C++編譯器的同時默認安裝上。
2 gprof原理簡述
????通過在編譯、鏈接的時候使用 -pg 選項,就能夠控制gcc/g++ 編譯器在程序的每個函數(shù)中插入插樁函數(shù),從而跟蹤目標函數(shù)的執(zhí)行時間、調(diào)用關(guān)系(通過函數(shù)調(diào)用堆棧查找)、執(zhí)行次數(shù)等信息,并將相關(guān)數(shù)據(jù)保存到 gmon.out 文件中。
????【注意】: 必須是編譯和鏈接的時候都要添加 -pg 參數(shù)。并且目標程序不能使用 strip 命令去除符號表,否則 gprof 工具無法正常處理 gmon.out 到 profile.txt文件。
3 gprof使用
3.1 gprof使用簡述
????使用cmake,在CMakeList.txt中添加
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -pg")
SET(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -pg")
SET(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -pg")
SET(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -pg")
????使用make,在Makefile中設(shè)置
CFLAGS += -pg
CPPFLAGS += -pg
LDFLAGS += -pg
b、 編譯后正常運行程序
.testApp arg1 arg2
【注意】 運行程序后,要程序正常退出,才能正常生成 gmon.out 文件;在此步驟中程序是可以帶命令行參數(shù)執(zhí)行的。
c、 分析、收集數(shù)據(jù)
gprof testApp gmon.out > profile.txt
【注意】 在此步驟中,目標程序如果是帶參的,此步驟不可以填入命令行參數(shù)。
d、 分析數(shù)據(jù)圖形化
gprof2dot -e0 -n0 profile.txt > profile.dot
dot profile.dot -Tpng -o profile.png
【說明】 在此步驟中,需要通過 gprof2dot 和 dot工具將結(jié)果圖形化,方便查看。
3.2 gprof使用示例
a、 clone s_log_safe項目源碼
git clone https://github.com/smallerxuan/s_log_safe.gitcd s_log_safetree
clone完成后進入路徑,能看到如下的目錄結(jié)構(gòu):
b、 修改Makefile文件
gedit ./Makefile
用編輯器打開 Makefile文件,修改 FLAGS_BASE 關(guān)閉O2優(yōu)化,添加 -pg 選項;給 LDFLAGS 追加 -pg;注釋 strip 調(diào)用。
c、 調(diào)整測試用例并編譯項目
gedit ./main.c
用編輯器打開 main.c 文件,對測試程序進行簡單修改,不然測試程序不會自然結(jié)束。
#include <stdio.h>
#include "s_log_safe.h"static int test_mark = 8;void* thread_1_exec(void* arg)
{int* count = (int*)arg;s_safe_tag("thread_1", S_LOG_SAFE_OPT_DEBUG);do {s_log_safe_a("%s %d","測試1",*count);usleep(5000);} while (*count -= 1);test_mark -= 1;return s_log_safe_null;
}void* thread_2_exec(void* arg)
{int* count = (int*)arg;s_safe_tag("thread_2", S_LOG_SAFE_OPT_DEBUG);do {s_log_safe_v("%s %d","測試2",*count);usleep(5000);} while (*count -= 1);test_mark -= 1;return s_log_safe_null;
}void* thread_3_exec(void* arg)
{int* count = (int*)arg;s_safe_tag("thread_3", S_LOG_SAFE_OPT_DEBUG);do {s_log_safe_e("%s %d","測試3",*count);usleep(5000);} while (*count -= 1);test_mark -= 1;return s_log_safe_null;
}void* thread_4_exec(void* arg)
{int* count = (int*)arg;s_safe_tag("thread_4", S_LOG_SAFE_OPT_DEBUG);do {s_log_safe_w("%s %d","測試4",*count);usleep(5000);} while (*count -= 1);test_mark -= 1;return s_log_safe_null;
}void* thread_5_exec(void* arg)
{int* count = (int*)arg;s_safe_tag("thread_5", S_LOG_SAFE_OPT_DEBUG);do {s_log_safe_t("%s %d","測試5",*count);usleep(5000);} while (*count -= 1);test_mark -= 1;return s_log_safe_null;
}void* thread_6_exec(void* arg)
{int* count = (int*)arg;s_safe_tag("thread_6", S_LOG_SAFE_OPT_DEBUG);do {s_log_safe_i("%s %d","測試6",*count);usleep(5000);} while (*count -= 1);test_mark -= 1;return s_log_safe_null;
}void* thread_7_exec(void* arg)
{int* count = (int*)arg;s_safe_tag("thread_7", S_LOG_SAFE_OPT_TRACE);do {s_log_safe_d("%s %d","測試7",*count);if((*count)%8 == 0) {s_safe_tag_log_level_limit_set(S_LOG_SAFE_OPT_DEBUG);} else {s_safe_tag_log_level_limit_set(S_LOG_SAFE_OPT_TRACE);}usleep(5000);} while (*count -= 1);test_mark -= 1;return s_log_safe_null;
}void* thread_main_exec(void* arg)
{int count = 0;unsigned int log_safe_pool_cap = 0;unsigned int log_safe_pool_used = 0;s_safe_tag("main", S_LOG_SAFE_OPT_DEBUG);log_safe_pool_cap = s_log_safe_output_pool_cap_get();while(test_mark > 1) {log_safe_pool_used = s_log_safe_output_pool_used_get();s_log_safe_i("%s log_safe_pool_cap:%d log_safe_pool_used:%d count:%d","main", log_safe_pool_cap, log_safe_pool_used, count++);usleep(5000);}test_mark -= 1;return s_log_safe_null;
}int main(void)
{int ret = 0;s_log_safe_thread_t* s_log_safe_thread_1_p;s_log_safe_thread_t* s_log_safe_thread_2_p;s_log_safe_thread_t* s_log_safe_thread_3_p;s_log_safe_thread_t* s_log_safe_thread_4_p;s_log_safe_thread_t* s_log_safe_thread_5_p;s_log_safe_thread_t* s_log_safe_thread_6_p;s_log_safe_thread_t* s_log_safe_thread_7_p;s_log_safe_thread_t* s_log_safe_thread_main_p;int count_1 = 77;int count_2 = 66;int count_3 = 55;int count_4 = 44;int count_5 = 33;int count_6 = 22;int count_7 = 11;ret = s_log_safe_init();if(ret != 0) {return 0;}s_log_safe_thread_7_p = s_log_safe_thread_create(thread_main_exec, (void*)s_log_safe_null, "", S_LOG_SAFE_THREAD_PRIORITY, 1024);s_log_safe_thread_1_p = s_log_safe_thread_create(thread_1_exec, (void*)&count_1, "", 10, 1024);s_log_safe_thread_2_p = s_log_safe_thread_create(thread_2_exec, (void*)&count_2, "", 10, 1024);s_log_safe_thread_3_p = s_log_safe_thread_create(thread_3_exec, (void*)&count_3, "", 12, 1024);s_log_safe_thread_4_p = s_log_safe_thread_create(thread_4_exec, (void*)&count_4, "", 12, 1024);s_log_safe_thread_5_p = s_log_safe_thread_create(thread_5_exec, (void*)&count_5, "", 11, 1024);s_log_safe_thread_6_p = s_log_safe_thread_create(thread_6_exec, (void*)&count_6, "", 11, 1024);s_log_safe_thread_7_p = s_log_safe_thread_create(thread_7_exec, (void*)&count_7, "", 10, 1024);while(test_mark != 0) {sleep(1);}return 0;
}
修改完成后運行make命令編譯測試程序。
make
運行 make 命令后,會在 ./buld 路徑生成目標測試程序。
d、 運測試程序
cd ./build/./s_log_safe_test
切換到 ./build 路徑后執(zhí)行測試程序,通過 ls 會在路徑下發(fā)現(xiàn)新生成了一個 gmon.out 文件。
e、 分析、收集數(shù)據(jù)
gprof s_log_safe_test gmon.out > profile.txt
通過該命令,可以在路徑下看見導(dǎo)出的分析結(jié)果文件 profile.txt。在該文件中,詳細的記錄了 函數(shù)的執(zhí)行時間、調(diào)用關(guān)系、執(zhí)行次數(shù)等信息。但是還不是特別方便查看,畢竟看圖會更直觀。
profile.txt文件的內(nèi)容如下:
Flat profile:Each sample counts as 0.01 seconds.no time accumulated% cumulative self self total time seconds seconds calls Ts/call Ts/call name 0.00 0.00 0.00 2048 0.00 0.00 s_log_safe_mutex_unlock0.00 0.00 0.00 2048 0.00 0.00 s_ring_buffer_unlock0.00 0.00 0.00 1294 0.00 0.00 s_log_safe_mutex_lock0.00 0.00 0.00 1294 0.00 0.00 s_ring_buffer_lock0.00 0.00 0.00 752 0.00 0.00 s_log_safe_mutex_try_lock0.00 0.00 0.00 752 0.00 0.00 s_log_strrchr0.00 0.00 0.00 751 0.00 0.00 s_ring_buffer_try_lock0.00 0.00 0.00 544 0.00 0.00 s_ring_buffer_could_read_num_get0.00 0.00 0.00 383 0.00 0.00 s_log_safe_out0.00 0.00 0.00 376 0.00 0.00 s_log_out_by_printf0.00 0.00 0.00 376 0.00 0.00 s_log_print0.00 0.00 0.00 376 0.00 0.00 s_log_safe_output0.00 0.00 0.00 376 0.00 0.00 s_ring_buffer_read_elements0.00 0.00 0.00 374 0.00 0.00 s_ring_buffer_write_elements0.00 0.00 0.00 78 0.00 0.00 s_log_safe_output_pool_used_get0.00 0.00 0.00 9 0.00 0.00 get_thread_policy0.00 0.00 0.00 9 0.00 0.00 s_log_safe_thread_create0.00 0.00 0.00 2 0.00 0.00 s_log_safe_mutex_create0.00 0.00 0.00 2 0.00 0.00 s_ring_buffer_lock_create0.00 0.00 0.00 1 0.00 0.00 s_log_safe_constructor0.00 0.00 0.00 1 0.00 0.00 s_log_safe_init0.00 0.00 0.00 1 0.00 0.00 s_log_safe_output_pool_cap_get0.00 0.00 0.00 1 0.00 0.00 s_ring_buffer_constructor0.00 0.00 0.00 1 0.00 0.00 s_ring_buffer_element_pool_constructor_malloc% the percentage of the total running time of the
time program used by this function.cumulative a running sum of the number of seconds accountedseconds for by this function and those listed above it.self the number of seconds accounted for by this
seconds function alone. This is the major sort for thislisting.calls the number of times this function was invoked, ifthis function is profiled, else blank.self the average number of milliseconds spent in this
ms/call function per call, if this function is profiled,else blank.total the average number of milliseconds spent in this
ms/call function and its descendents per call, if thisfunction is profiled, else blank.name the name of the function. This is the minor sortfor this listing. The index shows the location ofthe function in the gprof listing. If the index isin parenthesis it shows where it would appear inthe gprof listing if it were to be printed.Copyright (C) 2012-2015 Free Software Foundation, Inc.Copying and distribution of this file, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved.Call graph (explanation follows)granularity: each sample hit covers 2 byte(s) no time propagatedindex % time self children called name0.00 0.00 2048/2048 s_ring_buffer_unlock [2]
[1] 0.0 0.00 0.00 2048 s_log_safe_mutex_unlock [1]
-----------------------------------------------0.00 0.00 2/2048 s_ring_buffer_constructor [23]0.00 0.00 544/2048 s_ring_buffer_could_read_num_get [8]0.00 0.00 750/2048 s_ring_buffer_write_elements [14]0.00 0.00 752/2048 s_ring_buffer_read_elements [13]
[2] 0.0 0.00 0.00 2048 s_ring_buffer_unlock [2]0.00 0.00 2048/2048 s_log_safe_mutex_unlock [1]
-----------------------------------------------0.00 0.00 1294/1294 s_ring_buffer_lock [4]
[3] 0.0 0.00 0.00 1294 s_log_safe_mutex_lock [3]
-----------------------------------------------0.00 0.00 374/1294 s_ring_buffer_write_elements [14]0.00 0.00 376/1294 s_ring_buffer_read_elements [13]0.00 0.00 544/1294 s_ring_buffer_could_read_num_get [8]
[4] 0.0 0.00 0.00 1294 s_ring_buffer_lock [4]0.00 0.00 1294/1294 s_log_safe_mutex_lock [3]
-----------------------------------------------0.00 0.00 752/752 s_ring_buffer_try_lock [7]
[5] 0.0 0.00 0.00 752 s_log_safe_mutex_try_lock [5]
-----------------------------------------------0.00 0.00 752/752 s_log_safe_output [12]
[6] 0.0 0.00 0.00 752 s_log_strrchr [6]
-----------------------------------------------0.00 0.00 375/751 s_ring_buffer_write_elements [14]0.00 0.00 376/751 s_ring_buffer_read_elements [13]
[7] 0.0 0.00 0.00 751 s_ring_buffer_try_lock [7]0.00 0.00 752/752 s_log_safe_mutex_try_lock [5]
-----------------------------------------------0.00 0.00 78/544 s_log_safe_output_pool_used_get [15]0.00 0.00 466/544 s_log_safe_thread_exec_func [36]
[8] 0.0 0.00 0.00 544 s_ring_buffer_could_read_num_get [8]0.00 0.00 544/1294 s_ring_buffer_lock [4]0.00 0.00 544/2048 s_ring_buffer_unlock [2]
-----------------------------------------------0.00 0.00 11/383 thread_7_exec [50]0.00 0.00 22/383 thread_6_exec [49]0.00 0.00 33/383 thread_5_exec [48]0.00 0.00 43/383 thread_4_exec [47]0.00 0.00 55/383 thread_3_exec [46]0.00 0.00 66/383 thread_2_exec [45]0.00 0.00 76/383 thread_1_exec [44]0.00 0.00 77/383 thread_main_exec [51]
[9] 0.0 0.00 0.00 383 s_log_safe_out [9]0.00 0.00 374/374 s_ring_buffer_write_elements [14]
-----------------------------------------------0.00 0.00 376/376 s_log_print [11]
[10] 0.0 0.00 0.00 376 s_log_out_by_printf [10]
-----------------------------------------------0.00 0.00 376/376 s_log_safe_output [12]
[11] 0.0 0.00 0.00 376 s_log_print [11]0.00 0.00 376/376 s_log_out_by_printf [10]
-----------------------------------------------0.00 0.00 376/376 s_log_safe_thread_exec_func [36]
[12] 0.0 0.00 0.00 376 s_log_safe_output [12]0.00 0.00 752/752 s_log_strrchr [6]0.00 0.00 376/376 s_ring_buffer_read_elements [13]0.00 0.00 376/376 s_log_print [11]
-----------------------------------------------0.00 0.00 376/376 s_log_safe_output [12]
[13] 0.0 0.00 0.00 376 s_ring_buffer_read_elements [13]0.00 0.00 752/2048 s_ring_buffer_unlock [2]0.00 0.00 376/751 s_ring_buffer_try_lock [7]0.00 0.00 376/1294 s_ring_buffer_lock [4]
-----------------------------------------------0.00 0.00 374/374 s_log_safe_out [9]
[14] 0.0 0.00 0.00 374 s_ring_buffer_write_elements [14]0.00 0.00 750/2048 s_ring_buffer_unlock [2]0.00 0.00 375/751 s_ring_buffer_try_lock [7]0.00 0.00 374/1294 s_ring_buffer_lock [4]
-----------------------------------------------0.00 0.00 78/78 thread_main_exec [51]
[15] 0.0 0.00 0.00 78 s_log_safe_output_pool_used_get [15]0.00 0.00 78/544 s_ring_buffer_could_read_num_get [8]
-----------------------------------------------0.00 0.00 9/9 s_log_safe_thread_create [17]
[16] 0.0 0.00 0.00 9 get_thread_policy [16]
-----------------------------------------------0.00 0.00 1/9 s_log_safe_init [21]0.00 0.00 8/9 main [30]
[17] 0.0 0.00 0.00 9 s_log_safe_thread_create [17]0.00 0.00 9/9 get_thread_policy [16]
-----------------------------------------------0.00 0.00 2/2 s_ring_buffer_lock_create [19]
[18] 0.0 0.00 0.00 2 s_log_safe_mutex_create [18]
-----------------------------------------------0.00 0.00 2/2 s_ring_buffer_constructor [23]
[19] 0.0 0.00 0.00 2 s_ring_buffer_lock_create [19]0.00 0.00 2/2 s_log_safe_mutex_create [18]
-----------------------------------------------0.00 0.00 1/1 s_log_safe_init [21]
[20] 0.0 0.00 0.00 1 s_log_safe_constructor [20]0.00 0.00 1/1 s_ring_buffer_constructor [23]
-----------------------------------------------0.00 0.00 1/1 main [30]
[21] 0.0 0.00 0.00 1 s_log_safe_init [21]0.00 0.00 1/1 s_log_safe_constructor [20]0.00 0.00 1/9 s_log_safe_thread_create [17]
-----------------------------------------------0.00 0.00 1/1 thread_main_exec [51]
[22] 0.0 0.00 0.00 1 s_log_safe_output_pool_cap_get [22]
-----------------------------------------------0.00 0.00 1/1 s_log_safe_constructor [20]
[23] 0.0 0.00 0.00 1 s_ring_buffer_constructor [23]0.00 0.00 2/2 s_ring_buffer_lock_create [19]0.00 0.00 2/2048 s_ring_buffer_unlock [2]0.00 0.00 1/1 s_ring_buffer_element_pool_constructor_malloc [24]
-----------------------------------------------0.00 0.00 1/1 s_ring_buffer_constructor [23]
[24] 0.0 0.00 0.00 1 s_ring_buffer_element_pool_constructor_malloc [24]
-----------------------------------------------This table describes the call tree of the program, and was sorted bythe total amount of time spent in each function and its children.Each entry in this table consists of several lines. The line with theindex number at the left hand margin lists the current function.The lines above it list the functions that called this function,and the lines below it list the functions this one called.This line lists:index A unique number given to each element of the table.Index numbers are sorted numerically.The index number is printed next to every function name soit is easier to look up where the function is in the table.% time This is the percentage of the `total' time that was spentin this function and its children. Note that due todifferent viewpoints, functions excluded by options, etc,these numbers will NOT add up to 100%.self This is the total amount of time spent in this function.children This is the total amount of time propagated into thisfunction by its children.called This is the number of times the function was called.If the function called itself recursively, the numberonly includes non-recursive calls, and is followed bya `+' and the number of recursive calls.name The name of the current function. The index number isprinted after it. If the function is a member of acycle, the cycle number is printed between thefunction's name and the index number.For the function's parents, the fields have the following meanings:self This is the amount of time that was propagated directlyfrom the function into this parent.children This is the amount of time that was propagated fromthe function's children into this parent.called This is the number of times this parent called thefunction `/' the total number of times the functionwas called. Recursive calls to the function are notincluded in the number after the `/'.name This is the name of the parent. The parent's indexnumber is printed after it. If the parent is amember of a cycle, the cycle number is printed betweenthe name and the index number.If the parents of the function cannot be determined, the word`<spontaneous>' is printed in the `name' field, and all the otherfields are blank.For the function's children, the fields have the following meanings:self This is the amount of time that was propagated directlyfrom the child into the function.children This is the amount of time that was propagated from thechild's children to the function.called This is the number of times the function calledthis child `/' the total number of times the childwas called. Recursive calls by the child are notlisted in the number after the `/'.name This is the name of the child. The child's indexnumber is printed after it. If the child is amember of a cycle, the cycle number is printedbetween the name and the index number.If there are any cycles (circles) in the call graph, there is anentry for the cycle-as-a-whole. This entry shows who called thecycle (as parents) and the members of the cycle (as children.)The `+' recursive calls entry shows the number of function calls thatwere internal to the cycle, and the calls entry for each member shows,for that member, how many times it was called from other members ofthe cycle.Copyright (C) 2012-2015 Free Software Foundation, Inc.Copying and distribution of this file, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved.Index by function name[16] get_thread_policy (s_log_safe_platform.c) [1] s_log_safe_mutex_unlock [8] s_ring_buffer_could_read_num_get[10] s_log_out_by_printf [9] s_log_safe_out [24] s_ring_buffer_element_pool_constructor_malloc[11] s_log_print [12] s_log_safe_output (s_log_safe.c) [4] s_ring_buffer_lock[20] s_log_safe_constructor (s_log_safe.c) [22] s_log_safe_output_pool_cap_get [19] s_ring_buffer_lock_create[21] s_log_safe_init [15] s_log_safe_output_pool_used_get [13] s_ring_buffer_read_elements[18] s_log_safe_mutex_create [17] s_log_safe_thread_create [7] s_ring_buffer_try_lock[3] s_log_safe_mutex_lock [6] s_log_strrchr [2] s_ring_buffer_unlock[5] s_log_safe_mutex_try_lock [23] s_ring_buffer_constructor [14] s_ring_buffer_write_elements
f、 生成調(diào)用圖
gprof2dot -e0 -n0 profile.txt > profile.dot
dot profile.dot -Tpng -o profile.png
通過 gprof2dot 和 dot 工具能夠?qū)?profile.txt 文件轉(zhuǎn)換為 更直觀的 圖片。
如果沒有安裝 gprof2dot 和 dot 工具,可以通過以下命令進行安裝:
sudo apt-get install graphvizpip3 install gprof2dot
g、 查看調(diào)用圖
通過上述操作,最終生成了 profile.png 文件,最終的調(diào)用圖如下圖所示:
從該圖中,就比較直觀的看到了調(diào)用流程和調(diào)用次數(shù)