Sep 2018¶
Sep 20¶
```c
[ 54.602054] nr_pcache_pee_free: 0
[ 54.602537] nr_pcache_pee_free_kmalloc: 0
[ 1468.765410] mlx4_msi_x_interrupt(): IRQ: 27 CPU: 1
[ 1468.766956] event PORT_MNG_CHG arrived
[ 1468.768193]
[ 1468.813660] ib_cache: ib_cache_update(): Updated port 1 of dev 0000:00:08.0
[ 1468.815097] ib_sa_event(): TODO
[ 1479.178651] mlx4_msi_x_interrupt(): IRQ: 27 CPU: 1
[ 1479.180201] event PORT_MNG_CHG arrived
[ 1479.181430]
```
Sep 17¶
Cannot believe I’m wasting time on this crap X again.
Sep 16¶
Tests done today:
Setting | Log | nr_workers | Tracing (strace/counter/profiling) | Runtime (s) | pcache_flush_net (us) |
---|---|---|---|---|---|
TF-MNIST, Linux | | | | 13.2s | |
TF4-MNIST, 128MB | 0916-w14-1 | 1 | ON | avg 48.5s | 9891 |
TF4-MNIST, 128MB | 0916-w14-2 | 1 | OFF | (46.1+44.6+45.5+45.7+44)/5 = 45.2s | N/A |
TF4-MNIST, 128MB | 0916-w14-4 | 4 | ON | (43.4+44+43.9+42.6+42.1)/5=43.2 | 8351 |
TF4-MNIST, 128MB | 0916-w14-3 | 4 | OFF | (40.1+42.1+42.0+41.7+42.1)/5 = 41.6 | N/A |
TF4-Cifar, Linux | | | | 235.5s | |
TF4-Cifar, 128MB | 0916-w14-5 | 4 | OFF | (636.2+635.0+636.8+637.2+634.1)/5=635.8 | N/A |
TF4-Cifar, 128MB | 0916-w14-6 | 1 | OFF | (660.2+662.2+662.8+663.8+661+5)/5=663s | N/A |
TF4-Cifar, 256MB | 0916-w14-7 | 1 | OFF | 486s | N/A |
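The runtime cells in the table are plain arithmetic means over five runs. A minimal sketch of that arithmetic, plus the slowdown against the Linux baseline, could look like this. The helper names `mean_runtime` and `slowdown` are made up for illustration and are not part of LegoOS.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n runtimes in seconds (hypothetical helper). */
double mean_runtime(const double *runs, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += runs[i];
	return sum / (double)n;
}

/* How many times slower a run is than the baseline run. */
double slowdown(double runtime, double baseline)
{
	assert(baseline > 0.0);
	return runtime / baseline;
}
```

For the 0916-w14-3 row, the mean of {40.1, 42.1, 42.0, 41.7, 42.1} is 41.6s, which is about 3.15x the 13.2s Linux run.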
Sep 15¶
DAMN.
Let us summarize today. Okay. Fixed the double-post-cqe issue. Hehe. The post part was the only fucking code left in fit_poll_recv_cq that I had not looked into. And, ironically, there is no error checking around ib_post_recv(), so a failed post won’t generate any error/warning.
error checking error checking…
Anyway fuck it.
Today I created a new tag v0.0.9, hoping we finally have a stable net. The RPC profiling code puts a lot of stress on the network, and FIT survived.
The following warning is fixed by posting rx_depth/2.
```c
[ 1812.017204] fit: To align first QPN, we skipped: #72 #72 #73 #74 #75 #76 #77 #78 #79
[ 1812.157570] fit: fit_post_receives_message()-628 CPU 2 Fail to post recv conn_id: 12
[ 1812.166013] ------------[ cut here ]------------
[ 1812.171152] WARNING: CPU: 2 PID: 16 at net/lego/fit_internal.c:629 fit_post_receives_message.isra.7+0xce/0x100
[ 1812.182302] CPU: 2 PID: 16 Comm: ib-initd 4.0.0-lego+ #95
[ 1812.188314] Stack:
[ 1812.190544] ffff880ff98bfd50 ffffffff8101299b 0000000000000cff 0000000000000060
[ 1812.198689] 0000000000000d00 0000000000000100 ffff880ff98dc030 ffff880ff98bfd60
[ 1812.206834] ffffffff81012a8f ffff880ff98bfdc8 ffffffff810743de fffffff4fffffff4
[ 1812.214978] ffff880ff98bfd80 0000000000000000 0000000000000cff 0000000000000000
[ 1812.223124] 0000000000000000 ffff880ff98dc000 0000000000000000 000000000000000c
[ 1812.231269] Call Trace:
[ 1812.233984] <TSK>
[ 1812.236116] [<ffffffff810129a7>] __warn.constprop.0+0xa7/0x100
[ 1812.242613] [<ffffffff81012a8f>] warn_slowpath_null+0xf/0x20
[ 1812.248915] [<ffffffff810743de>] fit_post_receives_message.isra.7+0xce/0x100
[ 1812.256770] [<ffffffff81076a1a>] fit_add_newnode+0xca/0x170
[ 1812.262974] [<ffffffff81079d10>] fit_establish_conn+0x7b0/0xaa0
[ 1812.269568] [<ffffffff81073ce8>] ? ibv_add_one+0x98/0x120
[ 1812.275580] [<ffffffff810741f0>] ? ibapi_get_node_id+0x20/0x20
[ 1812.282076] [<ffffffff81074258>] lego_ib_init+0x68/0xf0
[ 1812.287893] [<ffffffff81023261>] kthread+0x111/0x130
[ 1812.293421] [<ffffffff81023150>] ? __kthread_parkme+0x70/0x70
[ 1812.299820] [<ffffffff8100eaf2>] ret_from_fork+0x22/0x30
[ 1812.305735] <EOT>
[ 1812.307868] ---[ end trace 0000000000000000 ]---
```
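A minimal sketch of the kind of return-value check that was missing on the post path. This is userspace C against a mock queue, not FIT or verbs code: `struct mock_rq`, `mock_post_recv`, and `post_recv_checked` are stand-ins invented for illustration. The only thing carried over from the real API is the convention that a post returns 0 on success and a negative errno on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Mock receive queue: only tracks how many buffers are posted. */
struct mock_rq {
	int depth;	/* capacity of the queue */
	int posted;	/* buffers currently posted */
};

/* Stand-in for a real post call: 0 on success, -ENOMEM when full. */
int mock_post_recv(struct mock_rq *rq)
{
	if (rq->posted >= rq->depth)
		return -ENOMEM;
	rq->posted++;
	return 0;
}

/*
 * Post up to nr buffers and stop at the first failure.
 * Returns the number actually posted; *err holds the first error (0 if none).
 */
int post_recv_checked(struct mock_rq *rq, int conn_id, int nr, int *err)
{
	int i;

	assert(rq && err && nr >= 0);
	*err = 0;
	for (i = 0; i < nr; i++) {
		int ret = mock_post_recv(rq);

		if (ret) {
			/* Make the failure loud instead of silently dropping it. */
			fprintf(stderr, "fail to post recv conn_id: %d ret: %d\n",
				conn_id, ret);
			*err = ret;
			break;
		}
	}
	return i;
}
```

With a queue of depth 4, posting 2 buffers and then asking for 4 more returns 2 with `*err == -ENOMEM`, so the caller sees the shortfall right away instead of finding it later in the poll loop.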
Sep 11¶
Got this log with 5 machines, during p2s_open. The S side has this issue. Damn.
```c
[ 1672.962279]
***** Fail to to get the CQE from send_cq after 20 seconds!
***** This means the packet was lost and something went wrong
***** with your NIC...
***** connection_id: 11 dest node: 0
[ 1673.061668] ------------[ cut here ]------------
[ 1673.074937] WARNING: CPU: 10 PID: 4624 at /root/ys/LegoOS_2M/linux-modules/fit/fit_internal.c:956 fit_internal_poll_sendcq+0xda/0x130 fit
[ 1673.101557] Modules linked in: storage(OF) fit(OF) xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr x86_pkg_temp_thermal coretemp kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ipmi_devintf ablk_helper cryptd ipmi_si iTCO_wdt ipmi_msghandler iTCO_vendor_support dcdbas sg pcspkr shpchp acpi_power_meter lpc_ich mfd_core wmi mperf uinput binfmt_misc ip_tables ext4 mbcache jbd2 mlx4_ib
[ 1673.182609] ib_sa ib_mad ib_core mlx4_en sd_mod crc_t10dif mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm drm ahci crc32c_intel libahci mlx4_core libata tg3 nvme megaraid_sas ptp i2c_core pps_core dm_mirror dm_region_hash dm_log dm_mod
[ 1673.222604] CPU: 10 PID: 4624 Comm: lego-storaged Tainted: GF W O 3.11.1-vanilla #1
[ 1673.235825] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.5.4 10/002/2015
[ 1673.248883] 0000000000000009 ffff88102186b9f8 ffffffff8159a5a4 0000000000000000
[ 1673.261795] ffff88102186ba30 ffffffff810641bd ffff882027180400 00000004a817c800
[ 1673.274499] 00000180dc3abde5 0000000000000000 0000000000000000 ffff88102186ba40
[ 1673.287034] Call Trace:
[ 1673.299259] [
```
Sep 08¶
Check this log out:
```
]---
[ 427.218569] STDOUT: ---[
INFO:tensorflow:Graph was finalized.
]---
[ 427.416043] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 427.424583] IP: [
```
Trying to tune the number of FIT polling threads. This could be the throughput/latency killer.
128M
P num_polling | M worker | M num_polling | Runtime (s) |
---|---|---|---|
1 | 1 | 1 | 46.8s |
1 | 4 | 1 |
Sep 07¶
Set up Infiniswap again. What fucking crap code, it crashes the kernel out of nowhere. crap crap crap.
Hmm, Linux tunes the CPU frequency at runtime, so it can run higher than 2.4GHz. So disable that to make it a fair comparison with Lego, using the kernel boot parameter:
intel_pstate=disable
Sep 06¶
Did two optimizations on pcache, both in buffer management, especially the pcache rmap case. In both, we kind of use a static/pre-allocated array to serve dynamic allocation requests.
This is faster than using kmem_cache, though kmem_cache would be the more general solution here.
kmem_cache, FIFO queue (thpool buffer), static preallocated array (rmap, clflush)… Buffer management is a really important thing in system building. I should be aware of this from the beginning next time.
These changes are in commits:
6e0cf6c5c64edbe445a27cf55f86ac51f8a897b3
73377cafce95ffa0cfb155f77cac97456a5e4a71
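The static/pre-allocated array idea can be sketched as a fixed-size object pool with an index-based free list. This is not the code from the commits above: `struct rmap_entry`, `POOL_SIZE`, `pool_init`, `pool_alloc`, and `pool_free` are hypothetical names, and a kernel version would also need a lock or per-CPU pools, which this single-threaded sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8	/* hypothetical; size it for the worst case */

struct rmap_entry {
	unsigned long addr;
	int next_free;	/* index of the next free slot, -1 at end of list */
};

static struct rmap_entry pool[POOL_SIZE];
static int free_head = -1;

/* Chain every slot into the free list. Call once before use. */
void pool_init(void)
{
	for (int i = 0; i < POOL_SIZE; i++)
		pool[i].next_free = (i + 1 < POOL_SIZE) ? i + 1 : -1;
	free_head = 0;
}

/* O(1) allocation: pop the head of the free list, NULL when exhausted. */
struct rmap_entry *pool_alloc(void)
{
	struct rmap_entry *e;

	if (free_head < 0)
		return NULL;
	e = &pool[free_head];
	free_head = e->next_free;
	return e;
}

/* O(1) free: push the slot back onto the free list. */
void pool_free(struct rmap_entry *e)
{
	ptrdiff_t idx = e - pool;

	assert(idx >= 0 && idx < POOL_SIZE);
	e->next_free = free_head;
	free_head = (int)idx;
}
```

Both paths are a couple of pointer updates with no search and no metadata lookup, which is where the speedup over a general allocator would come from. The cost is a hard cap on live objects, so the caller has to handle `NULL`.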
Sep 05¶
Alright. Besides some flaws/bugs in some kfree stuff, LegoOS is now actually very robust! Ran a quick git summary:
project : LegoOS
repo age : 1 year, 11 months
active : 358 days
commits : 1540
files : 1161
authors :
1317 Yizhou Shan 85.5%
120 root 7.8%
36 hythzz 2.3%
27 yilun 1.8%
16 Yutong Huang 1.0%
10 Build Android 0.6%
8 Yiying Zhang 0.5%
4 sumukh1991 0.3%
1 Yizhou SHan 0.1%
1 Sumukh Hallymysore Ravindra 0.1%
Of course, there is still PLENTY of room for improvement, and I know where. At this time, I really think we need something like kmem_cache, which is so fucking useful. It can probably reduce the overhead a lot further.
Sep 04¶
Trying the per-set eviction list mechanism, instead of the victim cache. The benefit of using this is: we will no longer be bottlenecked by the victim cache. Each faulting thread will do eviction/flush within its own context.
For 4-thread MNIST, I saw a 3-second reduction.
Removed the bitmap; use a per-pcache-set counter for quick reference.
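A rough sketch of what a per-set eviction list with a counter might look like. Everything here is assumed for illustration: `struct pset`, `NR_WAYS`, and the three helpers are hypothetical and do not reflect the actual pcache layout. Locking is also omitted, so a real version would need at least a per-set lock.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_WAYS 4	/* hypothetical associativity */

struct pset {
	int evict[NR_WAYS];	/* ring buffer of ways being evicted */
	int head;		/* index of the oldest entry */
	int nr_evict;		/* per-set counter, checked instead of a bitmap */
};

/* Quick check on the fault path: is any eviction in flight for this set? */
bool pset_has_eviction(const struct pset *s)
{
	return s->nr_evict > 0;
}

/* The faulting thread records the way it is about to flush. */
bool pset_add_eviction(struct pset *s, int way)
{
	assert(way >= 0 && way < NR_WAYS);
	if (s->nr_evict == NR_WAYS)
		return false;
	s->evict[(s->head + s->nr_evict) % NR_WAYS] = way;
	s->nr_evict++;
	return true;
}

/* The same thread drops the oldest entry once its flush is done; -1 if empty. */
int pset_finish_eviction(struct pset *s)
{
	int way;

	if (s->nr_evict == 0)
		return -1;
	way = s->evict[s->head];
	s->head = (s->head + 1) % NR_WAYS;
	s->nr_evict--;
	return way;
}
```

Since the list lives inside the set, threads faulting on different sets never touch shared eviction state, which is the point of moving away from one global victim cache.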
Sep 03¶
With DEBUG_MM on, try enabling HAVE_FREE directory by directory.
update_wall_time+0x44 is where we call tsc_read. And this has been called many times (HZ times per second). All of a sudden, the pointer got corrupted. Who wrote to this code memory?? Remote RDMA?
```c
[ 1052.470714] general protection fault: 0000 [#1] SMP PROCESSOR
[ 1052.477113] CPU: 0 PID: 15 Comm: ib_mad1 4.0.0-lego+ #509
[ 1052.483125] RIP: 0010:[
```