{"api_version":"1","generated_at":"2026-05-12T18:11:34+00:00","cve":"CVE-2026-43420","urls":{"html":"https://cve.report/CVE-2026-43420","api":"https://cve.report/api/cve/CVE-2026-43420.json","docs":"https://cve.report/api","cve_org":"https://www.cve.org/CVERecord?id=CVE-2026-43420","nvd":"https://nvd.nist.gov/vuln/detail/CVE-2026-43420"},"summary":{"title":"ceph: fix i_nlink underrun during async unlink","description":"In the Linux kernel, the following vulnerability has been resolved:\n\nceph: fix i_nlink underrun during async unlink\n\nDuring async unlink, we drop the `i_nlink` counter before we receive\nthe completion (that will eventually update the `i_nlink`) because \"we\nassume that the unlink will succeed\".  That is not a bad idea, but it\nraces against deletions by other clients (or against the completion of\nour own unlink) and can lead to an underrun which emits a WARNING like\nthis one:\n\n WARNING: CPU: 85 PID: 25093 at fs/inode.c:407 drop_nlink+0x50/0x68\n Modules linked in:\n CPU: 85 UID: 3221252029 PID: 25093 Comm: php-cgi8.1 Not tainted 6.14.11-cm4all1-ampere #655\n Hardware name: Supermicro ARS-110M-NR/R12SPD-A, BIOS 1.1b 10/17/2023\n pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)\n pc : drop_nlink+0x50/0x68\n lr : ceph_unlink+0x6c4/0x720\n sp : ffff80012173bc90\n x29: ffff80012173bc90 x28: ffff086d0a45aaf8 x27: ffff0871d0eb5680\n x26: ffff087f2a64a718 x25: 0000020000000180 x24: 0000000061c88647\n x23: 0000000000000002 x22: ffff07ff9236d800 x21: 0000000000001203\n x20: ffff07ff9237b000 x19: ffff088b8296afc0 x18: 00000000f3c93365\n x17: 0000000000070000 x16: ffff08faffcbdfe8 x15: ffff08faffcbdfec\n x14: 0000000000000000 x13: 45445f65645f3037 x12: 34385f6369706f74\n x11: 0000a2653104bb20 x10: ffffd85f26d73290 x9 : ffffd85f25664f94\n x8 : 00000000000000c0 x7 : 0000000000000000 x6 : 0000000000000002\n x5 : 0000000000000081 x4 : 0000000000000481 x3 : 0000000000000000\n x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff08727d3f91e8\n Call trace:\n  drop_nlink+0x50/0x68 (P)\n  vfs_unlink+0xb0/0x2e8\n  do_unlinkat+0x204/0x288\n  __arm64_sys_unlinkat+0x3c/0x80\n  invoke_syscall.constprop.0+0x54/0xe8\n  do_el0_svc+0xa4/0xc8\n  el0_svc+0x18/0x58\n  el0t_64_sync_handler+0x104/0x130\n  el0t_64_sync+0x154/0x158\n\nIn ceph_unlink(), a call to ceph_mdsc_submit_request() submits the\nCEPH_MDS_OP_UNLINK to the MDS, but does not wait for completion.\n\nMeanwhile, between this call and the following drop_nlink() call, a\nworker thread may process a CEPH_CAP_OP_IMPORT, CEPH_CAP_OP_GRANT or\njust a CEPH_MSG_CLIENT_REPLY (the latter of which could be our own\ncompletion).  These will lead to a set_nlink() call, updating the\n`i_nlink` counter to the value received from the MDS.  If that new\n`i_nlink` value happens to be zero, it is illegal to decrement it\nfurther.  But that is exactly what ceph_unlink() will do then.\n\nThe WARNING can be reproduced this way:\n\n1. Force async unlink; only the async code path is affected.  Having\n   no real clue about Ceph internals, I was unable to find out why the\n   MDS wouldn't give me the \"Fxr\" capabilities, so I patched\n   get_caps_for_async_unlink() to always succeed.\n\n   (Note that the WARNING dump above was found on an unpatched kernel,\n   without this kludge - this is not a theoretical bug.)\n\n2. Add a sleep call after ceph_mdsc_submit_request() so the unlink\n   completion gets handled by a worker thread before drop_nlink() is\n   called.  This guarantees that the `i_nlink` is already zero before\n   drop_nlink() runs.\n\nThe solution is to skip the counter decrement when it is already zero,\nbut doing so without a lock is still racy (TOCTOU).  Since\nceph_fill_inode() and handle_cap_grant() both hold the\n`ceph_inode_info.i_ceph_lock` spinlock while set_nlink() runs, this\nseems like the proper lock to protect the `i_nlink` updates.\n\nI found prior art in NFS and SMB (using `inode.i_lock`) and AFS (using\n`afs_vnode.cb_lock`).  All three have the zero check as well.","state":"PUBLISHED","assigner":"Linux","published_at":"2026-05-08 15:16:54","updated_at":"2026-05-12 14:10:27"},"problem_types":[],"metrics":[],"references":[{"url":"https://git.kernel.org/stable/c/6d5fd8bb574bef039eb3b738e523870433a2aeb9","name":"https://git.kernel.org/stable/c/6d5fd8bb574bef039eb3b738e523870433a2aeb9","refsource":"416baaa9-dc9f-4396-8d5f-8c081fb06d67","tags":[],"title":"","mime":"","httpstatus":"","archivestatus":"0"},{"url":"https://git.kernel.org/stable/c/aedd29386b23f3e1e6818943e11abfff2953732f","name":"https://git.kernel.org/stable/c/aedd29386b23f3e1e6818943e11abfff2953732f","refsource":"416baaa9-dc9f-4396-8d5f-8c081fb06d67","tags":[],"title":"","mime":"","httpstatus":"","archivestatus":"0"},{"url":"https://git.kernel.org/stable/c/ce0123cbb4a40a2f1bbb815f292b26e96088639f","name":"https://git.kernel.org/stable/c/ce0123cbb4a40a2f1bbb815f292b26e96088639f","refsource":"416baaa9-dc9f-4396-8d5f-8c081fb06d67","tags":[],"title":"","mime":"","httpstatus":"","archivestatus":"0"},{"url":"https://git.kernel.org/stable/c/7db008e85a5d17b64bc5390b828bf457ae91a415","name":"https://git.kernel.org/stable/c/7db008e85a5d17b64bc5390b828bf457ae91a415","refsource":"416baaa9-dc9f-4396-8d5f-8c081fb06d67","tags":[],"title":"","mime":"","httpstatus":"","archivestatus":"0"},{"url":"https://git.kernel.org/stable/c/b3f5513141ecc6b277a8f7b7efe58a0cf9a5e859","name":"https://git.kernel.org/stable/c/b3f5513141ecc6b277a8f7b7efe58a0cf9a5e859","refsource":"416baaa9-dc9f-4396-8d5f-8c081fb06d67","tags":[],"title":"","mime":"","httpstatus":"","archivestatus":"0"},{"url":"https://git.kernel.org/stable/c/8975b85b0d45ca811ace6fac5907652f2310e5ac","name":"https://git.kernel.org/stable/c/8975b85b0d45ca811ace6fac5907652f2310e5ac","refsource":"416baaa9-dc9f-4396-8d5f-8c081fb06d67","tags":[],"title":"","mime":"","httpstatus":"","archivestatus":"0"},{"url":"https://git.kernel.org/stable/c/fcc477a6e8856c8a42b3c9e171724d8d6dfadd06","name":"https://git.kernel.org/stable/c/fcc477a6e8856c8a42b3c9e171724d8d6dfadd06","refsource":"416baaa9-dc9f-4396-8d5f-8c081fb06d67","tags":[],"title":"","mime":"","httpstatus":"","archivestatus":"0"},{"url":"https://git.kernel.org/stable/c/9b31e88ac5623d15c8bc46f69dfe1d3b43a8f67c","name":"https://git.kernel.org/stable/c/9b31e88ac5623d15c8bc46f69dfe1d3b43a8f67c","refsource":"416baaa9-dc9f-4396-8d5f-8c081fb06d67","tags":[],"title":"","mime":"","httpstatus":"","archivestatus":"0"},{"url":"https://www.cve.org/CVERecord?id=CVE-2026-43420","name":"CVE Program record","refsource":"CVE.ORG","tags":["canonical"]},{"url":"https://nvd.nist.gov/vuln/detail/CVE-2026-43420","name":"NVD vulnerability detail","refsource":"NVD","tags":["canonical","analysis"]}],"affected":[{"source":"CNA","vendor":"Linux","product":"Linux","version":"affected 2ccb45462aeaf0831397b90d31d3d50a7704fa1f 9b31e88ac5623d15c8bc46f69dfe1d3b43a8f67c git","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"affected 2ccb45462aeaf0831397b90d31d3d50a7704fa1f 6d5fd8bb574bef039eb3b738e523870433a2aeb9 git","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"affected 2ccb45462aeaf0831397b90d31d3d50a7704fa1f fcc477a6e8856c8a42b3c9e171724d8d6dfadd06 git","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"affected 2ccb45462aeaf0831397b90d31d3d50a7704fa1f b3f5513141ecc6b277a8f7b7efe58a0cf9a5e859 git","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"affected 2ccb45462aeaf0831397b90d31d3d50a7704fa1f aedd29386b23f3e1e6818943e11abfff2953732f git","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"affected 2ccb45462aeaf0831397b90d31d3d50a7704fa1f 7db008e85a5d17b64bc5390b828bf457ae91a415 git","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"affected 2ccb45462aeaf0831397b90d31d3d50a7704fa1f 8975b85b0d45ca811ace6fac5907652f2310e5ac git","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"affected 2ccb45462aeaf0831397b90d31d3d50a7704fa1f ce0123cbb4a40a2f1bbb815f292b26e96088639f git","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"affected 5.7","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"unaffected 5.7 semver","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"unaffected 5.10.253 5.10.* semver","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"unaffected 5.15.203 5.15.* semver","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"unaffected 6.1.167 6.1.* semver","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"unaffected 6.6.130 6.6.* semver","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"unaffected 6.12.78 6.12.* semver","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"unaffected 6.18.19 6.18.* semver","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"unaffected 6.19.9 6.19.* semver","platforms":[]},{"source":"CNA","vendor":"Linux","product":"Linux","version":"unaffected 7.0 * original_commit_for_fix","platforms":[]}],"timeline":[],"solutions":[],"workarounds":[],"exploits":[],"credits":[],"nvd_cpes":[],"vendor_comments":[],"enrichments":{"kev":null,"epss":{"cve_year":"2026","cve_id":"43420","cve":"CVE-2026-43420","epss":"0.000240000","percentile":"0.070210000","score_date":"2026-05-11","updated_at":"2026-05-12 00:01:17"},"legacy_qids":[]},"source_records":{"cve_program":{"containers":{"cna":{"affected":[{"defaultStatus":"unaffected","product":"Linux","programFiles":["fs/ceph/dir.c"],"repo":"https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git","vendor":"Linux","versions":[{"lessThan":"9b31e88ac5623d15c8bc46f69dfe1d3b43a8f67c","status":"affected","version":"2ccb45462aeaf0831397b90d31d3d50a7704fa1f","versionType":"git"},{"lessThan":"6d5fd8bb574bef039eb3b738e523870433a2aeb9","status":"affected","version":"2ccb45462aeaf0831397b90d31d3d50a7704fa1f","versionType":"git"},{"lessThan":"fcc477a6e8856c8a42b3c9e171724d8d6dfadd06","status":"affected","version":"2ccb45462aeaf0831397b90d31d3d50a7704fa1f","versionType":"git"},{"lessThan":"b3f5513141ecc6b277a8f7b7efe58a0cf9a5e859","status":"affected","version":"2ccb45462aeaf0831397b90d31d3d50a7704fa1f","versionType":"git"},{"lessThan":"aedd29386b23f3e1e6818943e11abfff2953732f","status":"affected","version":"2ccb45462aeaf0831397b90d31d3d50a7704fa1f","versionType":"git"},{"lessThan":"7db008e85a5d17b64bc5390b828bf457ae91a415","status":"affected","version":"2ccb45462aeaf0831397b90d31d3d50a7704fa1f","versionType":"git"},{"lessThan":"8975b85b0d45ca811ace6fac5907652f2310e5ac","status":"affected","version":"2ccb45462aeaf0831397b90d31d3d50a7704fa1f","versionType":"git"},{"lessThan":"ce0123cbb4a40a2f1bbb815f292b26e96088639f","status":"affected","version":"2ccb45462aeaf0831397b90d31d3d50a7704fa1f","versionType":"git"}]},{"defaultStatus":"affected","product":"Linux","programFiles":["fs/ceph/dir.c"],"repo":"https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git","vendor":"Linux","versions":[{"status":"affected","version":"5.7"},{"lessThan":"5.7","status":"unaffected","version":"0","versionType":"semver"},{"lessThanOrEqual":"5.10.*","status":"unaffected","version":"5.10.253","versionType":"semver"},{"lessThanOrEqual":"5.15.*","status":"unaffected","version":"5.15.203","versionType":"semver"},{"lessThanOrEqual":"6.1.*","status":"unaffected","version":"6.1.167","versionType":"semver"},{"lessThanOrEqual":"6.6.*","status":"unaffected","version":"6.6.130","versionType":"semver"},{"lessThanOrEqual":"6.12.*","status":"unaffected","version":"6.12.78","versionType":"semver"},{"lessThanOrEqual":"6.18.*","status":"unaffected","version":"6.18.19","versionType":"semver"},{"lessThanOrEqual":"6.19.*","status":"unaffected","version":"6.19.9","versionType":"semver"},{"lessThanOrEqual":"*","status":"unaffected","version":"7.0","versionType":"original_commit_for_fix"}]}],"cpeApplicability":[{"nodes":[{"cpeMatch":[{"criteria":"cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*","versionEndExcluding":"5.10.253","versionStartIncluding":"5.7","vulnerable":true},{"criteria":"cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*","versionEndExcluding":"5.15.203","versionStartIncluding":"5.7","vulnerable":true},{"criteria":"cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*","versionEndExcluding":"6.1.167","versionStartIncluding":"5.7","vulnerable":true},{"criteria":"cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*","versionEndExcluding":"6.6.130","versionStartIncluding":"5.7","vulnerable":true},{"criteria":"cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*","versionEndExcluding":"6.12.78","versionStartIncluding":"5.7","vulnerable":true},{"criteria":"cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*","versionEndExcluding":"6.18.19","versionStartIncluding":"5.7","vulnerable":true},{"criteria":"cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*","versionEndExcluding":"6.19.9","versionStartIncluding":"5.7","vulnerable":true},{"criteria":"cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*","versionEndExcluding":"7.0","versionStartIncluding":"5.7","vulnerable":true}],"negate":false,"operator":"OR"}]}],"descriptions":[{"lang":"en","value":"In the Linux kernel, the following vulnerability has been resolved:\n\nceph: fix i_nlink underrun during async unlink\n\nDuring async unlink, we drop the `i_nlink` counter before we receive\nthe completion (that will eventually update the `i_nlink`) because \"we\nassume that the unlink will succeed\".  That is not a bad idea, but it\nraces against deletions by other clients (or against the completion of\nour own unlink) and can lead to an underrun which emits a WARNING like\nthis one:\n\n WARNING: CPU: 85 PID: 25093 at fs/inode.c:407 drop_nlink+0x50/0x68\n Modules linked in:\n CPU: 85 UID: 3221252029 PID: 25093 Comm: php-cgi8.1 Not tainted 6.14.11-cm4all1-ampere #655\n Hardware name: Supermicro ARS-110M-NR/R12SPD-A, BIOS 1.1b 10/17/2023\n pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)\n pc : drop_nlink+0x50/0x68\n lr : ceph_unlink+0x6c4/0x720\n sp : ffff80012173bc90\n x29: ffff80012173bc90 x28: ffff086d0a45aaf8 x27: ffff0871d0eb5680\n x26: ffff087f2a64a718 x25: 0000020000000180 x24: 0000000061c88647\n x23: 0000000000000002 x22: ffff07ff9236d800 x21: 0000000000001203\n x20: ffff07ff9237b000 x19: ffff088b8296afc0 x18: 00000000f3c93365\n x17: 0000000000070000 x16: ffff08faffcbdfe8 x15: ffff08faffcbdfec\n x14: 0000000000000000 x13: 45445f65645f3037 x12: 34385f6369706f74\n x11: 0000a2653104bb20 x10: ffffd85f26d73290 x9 : ffffd85f25664f94\n x8 : 00000000000000c0 x7 : 0000000000000000 x6 : 0000000000000002\n x5 : 0000000000000081 x4 : 0000000000000481 x3 : 0000000000000000\n x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff08727d3f91e8\n Call trace:\n  drop_nlink+0x50/0x68 (P)\n  vfs_unlink+0xb0/0x2e8\n  do_unlinkat+0x204/0x288\n  __arm64_sys_unlinkat+0x3c/0x80\n  invoke_syscall.constprop.0+0x54/0xe8\n  do_el0_svc+0xa4/0xc8\n  el0_svc+0x18/0x58\n  el0t_64_sync_handler+0x104/0x130\n  el0t_64_sync+0x154/0x158\n\nIn ceph_unlink(), a call to ceph_mdsc_submit_request() submits the\nCEPH_MDS_OP_UNLINK to the MDS, but does not wait for completion.\n\nMeanwhile, between this call and the following drop_nlink() call, a\nworker thread may process a CEPH_CAP_OP_IMPORT, CEPH_CAP_OP_GRANT or\njust a CEPH_MSG_CLIENT_REPLY (the latter of which could be our own\ncompletion).  These will lead to a set_nlink() call, updating the\n`i_nlink` counter to the value received from the MDS.  If that new\n`i_nlink` value happens to be zero, it is illegal to decrement it\nfurther.  But that is exactly what ceph_unlink() will do then.\n\nThe WARNING can be reproduced this way:\n\n1. Force async unlink; only the async code path is affected.  Having\n   no real clue about Ceph internals, I was unable to find out why the\n   MDS wouldn't give me the \"Fxr\" capabilities, so I patched\n   get_caps_for_async_unlink() to always succeed.\n\n   (Note that the WARNING dump above was found on an unpatched kernel,\n   without this kludge - this is not a theoretical bug.)\n\n2. Add a sleep call after ceph_mdsc_submit_request() so the unlink\n   completion gets handled by a worker thread before drop_nlink() is\n   called.  This guarantees that the `i_nlink` is already zero before\n   drop_nlink() runs.\n\nThe solution is to skip the counter decrement when it is already zero,\nbut doing so without a lock is still racy (TOCTOU).  Since\nceph_fill_inode() and handle_cap_grant() both hold the\n`ceph_inode_info.i_ceph_lock` spinlock while set_nlink() runs, this\nseems like the proper lock to protect the `i_nlink` updates.\n\nI found prior art in NFS and SMB (using `inode.i_lock`) and AFS (using\n`afs_vnode.cb_lock`).  All three have the zero check as well."}],"providerMetadata":{"dateUpdated":"2026-05-11T22:24:14.623Z","orgId":"416baaa9-dc9f-4396-8d5f-8c081fb06d67","shortName":"Linux"},"references":[{"url":"https://git.kernel.org/stable/c/9b31e88ac5623d15c8bc46f69dfe1d3b43a8f67c"},{"url":"https://git.kernel.org/stable/c/6d5fd8bb574bef039eb3b738e523870433a2aeb9"},{"url":"https://git.kernel.org/stable/c/fcc477a6e8856c8a42b3c9e171724d8d6dfadd06"},{"url":"https://git.kernel.org/stable/c/b3f5513141ecc6b277a8f7b7efe58a0cf9a5e859"},{"url":"https://git.kernel.org/stable/c/aedd29386b23f3e1e6818943e11abfff2953732f"},{"url":"https://git.kernel.org/stable/c/7db008e85a5d17b64bc5390b828bf457ae91a415"},{"url":"https://git.kernel.org/stable/c/8975b85b0d45ca811ace6fac5907652f2310e5ac"},{"url":"https://git.kernel.org/stable/c/ce0123cbb4a40a2f1bbb815f292b26e96088639f"}],"title":"ceph: fix i_nlink underrun during async unlink","x_generator":{"engine":"bippy-1.2.0"}}},"cveMetadata":{"assignerOrgId":"416baaa9-dc9f-4396-8d5f-8c081fb06d67","assignerShortName":"Linux","cveId":"CVE-2026-43420","datePublished":"2026-05-08T14:21:55.717Z","dateReserved":"2026-05-01T14:12:56.008Z","dateUpdated":"2026-05-11T22:24:14.623Z","state":"PUBLISHED"},"dataType":"CVE_RECORD","dataVersion":"5.2"},"nvd":{"publishedDate":"2026-05-08 15:16:54","lastModifiedDate":"2026-05-12 14:10:27","problem_types":[],"metrics":[],"configurations":[]},"legacy_mitre":{"record":{"CveYear":"2026","CveId":"43420","Ordinal":"1","Title":"ceph: fix i_nlink underrun during async unlink","CVE":"CVE-2026-43420","Year":"2026"},"notes":[{"CveYear":"2026","CveId":"43420","Ordinal":"1","NoteData":"In the Linux kernel, the following vulnerability has been resolved:\n\nceph: fix i_nlink underrun during async unlink\n\nDuring async unlink, we drop the `i_nlink` counter before we receive\nthe completion (that will eventually update the `i_nlink`) because \"we\nassume that the unlink will succeed\".  That is not a bad idea, but it\nraces against deletions by other clients (or against the completion of\nour own unlink) and can lead to an underrun which emits a WARNING like\nthis one:\n\n WARNING: CPU: 85 PID: 25093 at fs/inode.c:407 drop_nlink+0x50/0x68\n Modules linked in:\n CPU: 85 UID: 3221252029 PID: 25093 Comm: php-cgi8.1 Not tainted 6.14.11-cm4all1-ampere #655\n Hardware name: Supermicro ARS-110M-NR/R12SPD-A, BIOS 1.1b 10/17/2023\n pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)\n pc : drop_nlink+0x50/0x68\n lr : ceph_unlink+0x6c4/0x720\n sp : ffff80012173bc90\n x29: ffff80012173bc90 x28: ffff086d0a45aaf8 x27: ffff0871d0eb5680\n x26: ffff087f2a64a718 x25: 0000020000000180 x24: 0000000061c88647\n x23: 0000000000000002 x22: ffff07ff9236d800 x21: 0000000000001203\n x20: ffff07ff9237b000 x19: ffff088b8296afc0 x18: 00000000f3c93365\n x17: 0000000000070000 x16: ffff08faffcbdfe8 x15: ffff08faffcbdfec\n x14: 0000000000000000 x13: 45445f65645f3037 x12: 34385f6369706f74\n x11: 0000a2653104bb20 x10: ffffd85f26d73290 x9 : ffffd85f25664f94\n x8 : 00000000000000c0 x7 : 0000000000000000 x6 : 0000000000000002\n x5 : 0000000000000081 x4 : 0000000000000481 x3 : 0000000000000000\n x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff08727d3f91e8\n Call trace:\n  drop_nlink+0x50/0x68 (P)\n  vfs_unlink+0xb0/0x2e8\n  do_unlinkat+0x204/0x288\n  __arm64_sys_unlinkat+0x3c/0x80\n  invoke_syscall.constprop.0+0x54/0xe8\n  do_el0_svc+0xa4/0xc8\n  el0_svc+0x18/0x58\n  el0t_64_sync_handler+0x104/0x130\n  el0t_64_sync+0x154/0x158\n\nIn ceph_unlink(), a call to ceph_mdsc_submit_request() submits the\nCEPH_MDS_OP_UNLINK to the MDS, but does not wait for completion.\n\nMeanwhile, between this call and the following drop_nlink() call, a\nworker thread may process a CEPH_CAP_OP_IMPORT, CEPH_CAP_OP_GRANT or\njust a CEPH_MSG_CLIENT_REPLY (the latter of which could be our own\ncompletion).  These will lead to a set_nlink() call, updating the\n`i_nlink` counter to the value received from the MDS.  If that new\n`i_nlink` value happens to be zero, it is illegal to decrement it\nfurther.  But that is exactly what ceph_unlink() will do then.\n\nThe WARNING can be reproduced this way:\n\n1. Force async unlink; only the async code path is affected.  Having\n   no real clue about Ceph internals, I was unable to find out why the\n   MDS wouldn't give me the \"Fxr\" capabilities, so I patched\n   get_caps_for_async_unlink() to always succeed.\n\n   (Note that the WARNING dump above was found on an unpatched kernel,\n   without this kludge - this is not a theoretical bug.)\n\n2. Add a sleep call after ceph_mdsc_submit_request() so the unlink\n   completion gets handled by a worker thread before drop_nlink() is\n   called.  This guarantees that the `i_nlink` is already zero before\n   drop_nlink() runs.\n\nThe solution is to skip the counter decrement when it is already zero,\nbut doing so without a lock is still racy (TOCTOU).  Since\nceph_fill_inode() and handle_cap_grant() both hold the\n`ceph_inode_info.i_ceph_lock` spinlock while set_nlink() runs, this\nseems like the proper lock to protect the `i_nlink` updates.\n\nI found prior art in NFS and SMB (using `inode.i_lock`) and AFS (using\n`afs_vnode.cb_lock`).  All three have the zero check as well.","Type":"Description","Title":"ceph: fix i_nlink underrun during async unlink"}]}}}