
How to enable TCP BBR2 network congestion control on Windows 11

On Linux, it is common to enable Google's BBR congestion control algorithm. BBRv2 can now be enabled on Windows 11 as well.

BBRv2 is a model-based congestion control algorithm designed for reduced queuing, low loss, and (bounded) Reno/CUBIC coexistence. It maintains a model of the network path using measurements of bandwidth and RTT and, if they occur, packet loss and/or DCTCP/L4S-style ECN signaling.

BBR2 is more "fair" than BBR. On links with high latency and packet loss, its throughput can be much lower than BBR's, and sometimes lower than that of the default CUBIC, so test it on your own link before relying on it. Treat it as a modest improvement at best.

Windows Server now uses the CUBIC congestion control algorithm by default, which is currently the most commonly used congestion control algorithm.

Contents

  • 1. Related Links
  • 2. Steps
    • 2.1. Windows 11
    • 2.2. Linux
  • 3. Restore settings

Related Links

Google BBR GitHub: https://github.com/google/bbr

Introduction to TCP BBR v2 Alpha/Preview

Steps

Windows 11

Requires Windows 11 version 22H2 or later.

1. Right-click PowerShell, choose "Run as administrator", and run the following commands:

netsh int tcp set supplemental Template=Internet CongestionProvider=bbr2
netsh int tcp set supplemental Template=Datacenter CongestionProvider=bbr2
netsh int tcp set supplemental Template=Compat CongestionProvider=bbr2
netsh int tcp set supplemental Template=DatacenterCustom CongestionProvider=bbr2
netsh int tcp set supplemental Template=InternetCustom CongestionProvider=bbr2

2. Verify that BBR2 is enabled:

Get-NetTCPSetting | Select SettingName, CongestionProvider

Linux

View the current congestion control algorithm

sysctl net.ipv4.tcp_congestion_control

If the output is something like net.ipv4.tcp_congestion_control = cubic, the current algorithm is CUBIC. To change it to BBR, edit /etc/sysctl.conf and add the following lines:

net.core.default_qdisc=fq
net.ipv4.tcp_congestion_control=bbr

Save the file and apply the changes:

sysctl -p

Checking again should now show:

net.ipv4.tcp_congestion_control = bbr

Despite the net.ipv4.tcp prefix, this setting also applies to TCP over IPv6.
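Before editing /etc/sysctl.conf it can help to confirm that the running kernel actually offers bbr. The sketch below is only an illustration: the helper names sysctl_value and has_algorithm are made up for this example, and the live-system lines are left commented out so the snippet has no side effects. It relies on the standard net.ipv4.tcp_available_congestion_control key and the tcp_bbr kernel module.

```shell
#!/bin/sh
# Hypothetical helper: extract the value from a sysctl-style "key = value" line.
sysctl_value() {
    # Drop everything up to and including the first "=", then trim whitespace.
    printf '%s\n' "$1" | sed -e 's/^[^=]*=[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Hypothetical helper: succeed if algorithm $1 appears in the
# space-separated list $2.
has_algorithm() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a live system (commented out on purpose):
# available=$(sysctl_value "$(sysctl net.ipv4.tcp_available_congestion_control)")
# if has_algorithm bbr "$available"; then
#     echo "bbr is available"
# else
#     echo "bbr is not listed; loading the module may help: modprobe tcp_bbr"
# fi
```

If bbr is missing from the list, setting it in sysctl.conf will fail when the file is applied, so this check is worth doing first.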

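Restore settings

Enabling BBR2 can occasionally cause unexpected problems.

To switch back to CUBIC, run the following commands in an elevated PowerShell session. Note that they do not reset the Compat template, which was also set to bbr2 in the steps above.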

netsh int tcp set supplemental template=internet congestionprovider=CUBIC
netsh int tcp set supplemental template=internetcustom congestionprovider=CUBIC
netsh int tcp set supplemental template=Datacenter congestionprovider=CUBIC
netsh int tcp set supplemental template=Datacentercustom congestionprovider=CUBIC

Test script bench.sh (for network and IO tests of various Linux distributions)

After several iterations, the one-click test script bench.sh now works for network and I/O testing on almost every Linux distribution, and it presents the results in a cleaner layout.

The features of bench.sh, in summary:

1. Displays various system information;
2. Runs network tests against Speedtest nodes in many parts of the world, so the network coverage is fairly comprehensive;
3. Supports IPv6 download speed measurement;
4. Runs the I/O test (a sequential 1 GB write) three times and displays the average.
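The "three runs, then average" idea from the last feature can be sketched as follows. This is not the code bench.sh itself uses: the helper names average and io_write_once are invented for the illustration, and the dd options shown (conv=fdatasync, the k suffix on count) assume GNU dd.

```shell
#!/bin/sh
# Hypothetical helper: average any number of throughput readings (e.g. MB/s)
# and print the result with one decimal place.
average() {
    printf '%s\n' "$@" | awk '{ sum += $1 } END { if (NR > 0) printf "%.1f\n", sum / NR }'
}

# Hypothetical helper: one sequential 1 GiB write (64 KiB x 16384 blocks) to
# the scratch file $1, printing dd's final status line, then cleaning up.
# Assumes GNU dd; conv=fdatasync flushes data to disk before dd reports.
io_write_once() {
    LC_ALL=C dd if=/dev/zero of="$1" bs=64k count=16k conv=fdatasync 2>&1 | tail -n 1
    rm -f "$1"
}

# Usage (commented out because it writes 1 GiB to disk each time):
# io_write_once ./benchtest_$$
# average 151.2 148.8 150.0
```

Averaging several runs smooths out the effect of caching and of noisy neighbours on a shared VPS, which is why a single dd reading is rarely trusted on its own.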

Combined with the unixbench.sh script, it gives a thorough picture of a VPS's performance.

How to use:
Command 1:

wget -qO- bench.sh | bash

or

curl -Lso- bench.sh | bash

Command 2:

wget -qO- 86.re/bench.sh | bash

or

curl -so- 86.re/bench.sh | bash

Note:
bench.sh is both the name of the script and a domain name, so the commands above are not a typo.

Download address:
https://github.com/teddysun/across/blob/master/bench.sh

Update log

Updated on February 22, 2022:
1. Added detection of whether the CPU supports AES-NI and VM-x/AMD-V;
2. Improved the calculation of hard disk space;
3. Improved the calculation of RAM and swap;
4. Improved the timestamp display and added the time zone.

[Figure: bench.sh output]


Updated on January 1, 2022:
1. Optimized the script logic and tidied up the display;
2. Upgraded speedtest-cli to version 1.1.1;
3. Added support for the arm64 (aarch64) and armv7l (armhf) architectures;
・arm64 (aarch64)

[Figure: bench.sh output on arm64 (aarch64)]

・armv7l (armhf)

[Figure: bench.sh output on armv7l (armhf)]

4. Optimized the list of speed test servers provided by Speedtest. The node information for Shanghai, Nanjing, and Guangzhou in China is as follows:

24447) China Unicom 5G (ShangHai, China)
26352) China Telecom JiangSu 5G (Nanjing, China)
27594) ChinaTelecom 5G (Guangzhou, China)

・x86_64

[Figure: bench.sh output on x86_64]

Updated on July 29, 2020:
1. Changed: the speed test now uses Speedtest, which reports upload and download separately and is therefore more practical;
2. Added: TCP congestion control, virtualization type, IP information, and so on.

My Vultr test results:

[Figure: bench.sh output on a Vultr VPS]

Updated on January 7, 2018:
Changed: the displayed information is now color-coded by category to make it easier to tell apart.

Test results from my KS3C 100M dedicated server:

[Figure: bench.sh output on a KS3C 100M dedicated server]

Updated on November 24, 2016:
Added: hard disk information is now displayed; the I/O speed test runs earlier, and the network download speed test runs last.

Example:

[Figure: bench.sh output]

Finally, a few more test screenshots:

[Figure: Bandwagon Host, Los Angeles]

[Figure: DigitalOcean, Singapore]

[Figure: Ramnode, Seattle]

[Figure: Xvmlabs, Los Angeles]

Reprinted: Qiushui Yibing  »  One-click test script bench.sh

Linux performance test UnixBench one-click script

 


UnixBench is an open source performance testing tool for Unix-like systems (Unix, BSD, Linux) and is widely used to benchmark Linux hosts. Its main test items cover system calls, file reads and writes, processes, pipes, arithmetic operations, the C library, and 2D and 3D graphics.

The latest version, UnixBench 5.1.3, includes both system and graphics tests. To run the graphics tests, edit the Makefile so that the line "GRAPHIC_TESTS = defined" is not commented out; the system must also provide the x11perf command and the GL libraries (GL_LIBS in the Makefile).
The script below runs UnixBench 5.1.3 with the graphics tests commented out (most VPSs have neither a graphics card nor integrated graphics, so there is no need to test graphics performance). A run takes 10-30 minutes, depending on the number of CPU cores, and produces a score; the higher the better.

How to run the test:

wget --no-check-certificate https://github.com/teddysun/across/raw/master/unixbench.sh
chmod +x unixbench.sh
./unixbench.sh

Test items:
Dhrystone 2 using register variables
This test focuses on string handling. Because it contains no floating-point operations, the result is heavily influenced by hardware and software design, compiler and linker options, code optimization, cache memory, wait states, and integer data types.

Double-Precision Whetstone
This test measures the speed and efficiency of floating-point operations. It consists of several modules, each containing a set of operations typical of scientific computing. A range of C functions (sin, cos, sqrt, exp, log) is used alongside integer and floating-point arithmetic, array access, conditional branches, and procedure calls, so the test exercises both integer and floating-point arithmetic.

Execl Throughput
This test measures the number of execl calls that can be executed per second. execl is a member of the exec family of functions and, like the other members of that family, is a front end to execve().

File Copy
This test measures the rate at which data is transferred from one file to another, using a different buffer size in each run. It counts the number of file read, write, and copy operations completed within the specified time (10 s by default).

Pipe Throughput
A pipe is the simplest form of inter-process communication. Pipe throughput here is the number of times per second a process can write 512 bytes to a pipe and read them back. Note that pipe throughput has no direct counterpart in real-world programming.

Pipe-based Context Switching
This test measures the number of times per second two processes exchange an ever-increasing integer through a pipe, which closely resembles some real-world applications. The test program first creates a child process and then communicates with it over pipes in both directions.

Process Creation
This test measures the number of times per second a process can create a child process and then reap it (the child exits immediately). Process creation centers on setting up the new process's process control block and allocating its memory, so the test mainly reflects memory bandwidth. In general, it is used to compare different implementations of the operating system's process creation system call.

System Call Overhead
This test measures the cost of entering and leaving the operating system kernel, that is, the cost of a system call. It does this with a small program that repeatedly calls the getpid function.

Shell Scripts
This test measures the number of times per minute a process can start and complete n concurrent copies of a shell script, where n is usually 1, 2, 4, or 8 (I use 1 and 8 when testing). The script performs a series of transformations on a data file.

Below are the benchmark results from one of my VPSs (512 MB RAM, 2 cores, OpenVZ):

BYTE UNIX Benchmarks (Version 5.1.3)

System: vpn: GNU/Linux
OS: GNU/Linux -- 2.6.32-042stab076.8 -- #1 SMP Tue May 14 20:38:14 MSK 2013
Machine: i686 (i386)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Xeon(R) CPU L5520 @ 2.27GHz (4533.6 bogomips)
       Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization
CPU 1: Intel(R) Xeon(R) CPU L5520 @ 2.27GHz (4533.6 bogomips)
       Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET, Intel virtualization
09:41:17 up 31 days, 9:21, 1 user, load average: 0.23, 0.05, 0.02; runlevel 3

------------------------------------------------------------------------
Benchmark Run: Mon Jul 29 2013 09:41:17 - 10:09:29
2 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       17172222.3 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     2600.2 MWIPS (10.0 s, 7 samples)
Execl Throughput                               4152.8 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        622759.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          172634.3 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1218236.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1416230.5 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 206509.4 lps   (10.0 s, 7 samples)
Process Creation                               8568.6 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3145.9 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    528.3 lpm   (60.0 s, 2 samples)
System Call Overhead                        1528474.7 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   17172222.3   1471.5
Double-Precision Whetstone                       55.0       2600.2    472.8
Execl Throughput                                 43.0       4152.8    965.8
File Copy 1024 bufsize 2000 maxblocks          3960.0     622759.5   1572.6
File Copy 256 bufsize 500 maxblocks            1655.0     172634.3   1043.1
File Copy 4096 bufsize 8000 maxblocks          5800.0    1218236.9   2100.4
Pipe Throughput                               12440.0    1416230.5   1138.4
Pipe-based Context Switching                   4000.0     206509.4    516.3
Process Creation                                126.0       8568.6    680.0
Shell Scripts (1 concurrent)                     42.4       3145.9    742.0
Shell Scripts (8 concurrent)                      6.0        528.3    880.5
System Call Overhead                          15000.0    1528474.7   1019.0
                                                                   ========
System Benchmarks Index Score                                         960.4

------------------------------------------------------------------------
Benchmark Run: Mon Jul 29 2013 10:09:29 - 10:39:56
2 CPUs in system; running 2 parallel copies of tests

Dhrystone 2 using register variables       16851634.7 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     5182.9 MWIPS (10.0 s, 7 samples)
Execl Throughput                               4101.9 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        635244.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          174430.2 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1219982.0 KBps  (30.0 s, 2 samples)
Pipe Throughput                             1387297.9 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 196296.1 lps   (10.0 s, 7 samples)
Process Creation                              10889.9 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   4073.7 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                    550.5 lpm   (60.2 s, 2 samples)
System Call Overhead                        1538517.4 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   16851634.7   1444.0
Double-Precision Whetstone                       55.0       5182.9    942.3
Execl Throughput                                 43.0       4101.9    953.9
File Copy 1024 bufsize 2000 maxblocks          3960.0     635244.9   1604.2
File Copy 256 bufsize 500 maxblocks            1655.0     174430.2   1054.0
File Copy 4096 bufsize 8000 maxblocks          5800.0    1219982.0   2103.4
Pipe Throughput                               12440.0    1387297.9   1115.2
Pipe-based Context Switching                   4000.0     196296.1    490.7
Process Creation                                126.0      10889.9    864.3
Shell Scripts (1 concurrent)                     42.4       4073.7    960.8
Shell Scripts (8 concurrent)                      6.0        550.5    917.5
System Call Overhead                          15000.0    1538517.4   1025.7
                                                                   ========
System Benchmarks Index Score                                        1058.3
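
The numbers in the INDEX column of the output above are consistent with each result divided by its baseline and multiplied by 10, and the final score with the geometric mean of the individual indices. The sketch below reproduces that arithmetic with awk; the function names index_value and index_score are invented for this illustration and are not part of UnixBench.

```shell
#!/bin/sh
# Hypothetical helper: INDEX for one test = RESULT / BASELINE * 10,
# printed with one decimal place. $1 = baseline, $2 = result.
index_value() {
    awk -v base="$1" -v result="$2" 'BEGIN { printf "%.1f\n", result / base * 10 }'
}

# Hypothetical helper: overall score as the geometric mean of the
# per-test INDEX values passed as arguments.
index_score() {
    printf '%s\n' "$@" | awk '{ s += log($1) } END { if (NR > 0) printf "%.1f\n", exp(s / NR) }'
}

# Dhrystone row of the first run: baseline 116700.0, result 17172222.3
index_value 116700.0 17172222.3    # prints 1471.5
```

Feeding the twelve INDEX values of a run into index_score should land close to the reported Index Score; small differences are possible because the printed indices are already rounded. A geometric mean keeps one unusually high result (such as File Copy) from dominating the overall score.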