MENU

华为鲲鹏服务器初探

February 4, 2020 • Read: 954 • 技术技巧

起因

报名了华为云微认证 轻松玩转Kubernetes ,需要一台北京四区的华为云 ECS 做客户机进行实验,发现华为云的 云创校园 活动新推出了鲲鹏云服务器套餐,通用计算增强型云服务器,搭载自研华为鲲鹏920处理器及25GE智能高速网卡,提供强劲鲲鹏算力和高性能网络,购买指定配置服务可享受9元/月优惠,并赠送相同时长主机安全,遂买来测试+实验。

2020-02-04-15-58-00-7d672dd008b7ab1c.png

这时博主干了一件蠢事,下单的时候居然买成了普通的云服务器套餐,我是要体验华为鲲鹏的呀!于是将计就计,再买一台鲲鹏的,和华为自家弹性服务器做一下对比测试。

本次测试采用 UnixBench 脚本。

UnixBench是一个类unix系(Unix,BSD,Linux)统下的性能测试工具,一个开源工具,被广泛用与测试linux系统主机的性能。Unixbench的主要测试项目有:系统调用、读写、进程、图形化测试、2D、3D、管道、运算、C库等系统基准性能提供测试数据。

测试方法:

wget --no-check-certificate https://github.com/teddysun/across/raw/master/unixbench.sh
chmod +x unixbench.sh
./unixbench.sh

由于国内服务器访问 GitHub 有一定限制,测试时将脚本在本站资源站做了备份,直接从资源站调用。

wget --no-check-certificate https://res.frytea.com/Bash/unixbench.sh
chmod +x unixbench.sh
./unixbench.sh

测试结果

华为云 Ecs ecs-sn3_medium_2_linux

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: ecs-sn3-medium-2-linux-20200204152547: GNU/Linux
   OS: GNU/Linux -- 3.10.0-1062.1.1.el7.x86_64 -- #1 SMP Fri Sep 13 22:55:44 UTC 2019
   Machine: x86_64 (x86_64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   CPU 0: Intel(R) Xeon(R) Gold 6161 CPU @ 2.20GHz (4400.0 bogomips)
          x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
   16:41:41 up  1:00,  1 user,  load average: 0.03, 0.03, 0.06; runlevel 3

------------------------------------------------------------------------
Benchmark Run: Tue Feb 04 2020 16:41:41 - 17:09:47
1 CPU in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       33259701.9 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3940.9 MWIPS (9.8 s, 7 samples)
Execl Throughput                               3746.7 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        771510.6 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          207548.2 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       2198798.7 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1156756.3 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 223226.3 lps   (10.0 s, 7 samples)
Process Creation                              14029.3 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   5686.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    763.8 lpm   (60.1 s, 2 samples)
System Call Overhead                        1000249.0 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   33259701.9   2850.0
Double-Precision Whetstone                       55.0       3940.9    716.5
Execl Throughput                                 43.0       3746.7    871.3
File Copy 1024 bufsize 2000 maxblocks          3960.0     771510.6   1948.3
File Copy 256 bufsize 500 maxblocks            1655.0     207548.2   1254.1
File Copy 4096 bufsize 8000 maxblocks          5800.0    2198798.7   3791.0
Pipe Throughput                               12440.0    1156756.3    929.9
Pipe-based Context Switching                   4000.0     223226.3    558.1
Process Creation                                126.0      14029.3   1113.4
Shell Scripts (1 concurrent)                     42.4       5686.2   1341.1
Shell Scripts (8 concurrent)                      6.0        763.8   1273.1
System Call Overhead                          15000.0    1000249.0    666.8
                                                                   ========
System Benchmarks Index Score                                        1219.7



======= Script description and score comparison completed! =======

2020-02-04-17-12-39-a3a2911508b42b6c.png

华为云鲲鹏 ecs-kc1_small_1_linux

========================================================================
   BYTE UNIX Benchmarks (Version 5.1.3)

   System: ecs-kc1-small-1-linux-20200204152604: GNU/Linux
   OS: GNU/Linux -- 4.18.0-80.7.2.el7.aarch64 -- #1 SMP Thu Sep 12 16:13:20 UTC 2019
   Machine: aarch64 (aarch64)
   Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
   16:44:32 up 59 min,  1 user,  load average: 0.16, 0.03, 0.02; runlevel 3

------------------------------------------------------------------------
Benchmark Run: Tue Feb 04 2020 16:44:32 - 17:12:41
0 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       23810689.9 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3484.7 MWIPS (10.0 s, 7 samples)
Execl Throughput                               4757.1 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        513287.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          137669.4 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1644878.0 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1157950.1 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 248199.1 lps   (10.0 s, 7 samples)
Process Creation                               8687.8 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   6061.2 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    806.8 lpm   (60.0 s, 2 samples)
System Call Overhead                        1016165.6 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   23810689.9   2040.3
Double-Precision Whetstone                       55.0       3484.7    633.6
Execl Throughput                                 43.0       4757.1   1106.3
File Copy 1024 bufsize 2000 maxblocks          3960.0     513287.0   1296.2
File Copy 256 bufsize 500 maxblocks            1655.0     137669.4    831.8
File Copy 4096 bufsize 8000 maxblocks          5800.0    1644878.0   2836.0
Pipe Throughput                               12440.0    1157950.1    930.8
Pipe-based Context Switching                   4000.0     248199.1    620.5
Process Creation                                126.0       8687.8    689.5
Shell Scripts (1 concurrent)                     42.4       6061.2   1429.5
Shell Scripts (8 concurrent)                      6.0        806.8   1344.7
System Call Overhead                          15000.0    1016165.6    677.4
                                                                   ========
System Benchmarks Index Score                                        1070.6



======= Script description and score comparison completed! =======

2020-02-04-17-14-02-6524113d8c35a8c6.png

总结

测试前使用其他脚本对两台服务器做过测试,其中 i/o 性能鲲鹏略好,下载速度也是鲲鹏稍快,其他参数差别不大。但从配置来看普通服务器是 1C2GB 而鲲鹏是 1C1GB,使用 UnixBench 测试后发现二者整体评分差别在1200分中的200分左右,浮点速率,函数速率等几乎持平,但 intel 确实略胜于鲲鹏。

这款搭载了 华为鲲鹏920处理器 的弹性云服务器整体表现可以说是与 Intel(R) Xeon(R) Gold 6161 CPU @ 2.20GHz 几乎持平,但文章 鲲鹏云服务器实战:华为云鲲鹏KC1实例 vs. 阿里云G5实例 直接对 CPU 进行跑分后发现华为云鲲鹏920处理器是要优于目前主流处理器的。

鲲鹏920处理器是华为在2019年1月发布的数据中心高性能处理器,由华为自主研发和设计,旨在满足数据中心多样性计算、绿色计算的需求 。鲲鹏920处理器兼容ARM架构,采用7nm工艺制造,可以支持32/48/64个内核,主频可达2.6GHz,支持8通道DDR4、PCIe 4.0和100G RoCE网络。

时间有限,以后再研究对这款处理器进行跑分,但无论如何,这款鲲鹏920处理器都给国人一剂强心剂,国产技术需要我们每一个人共同的推动,一起加油吧!

参考文献

Archives QR Code Tip
QR Code for this page
Tipping QR Code