Improper locking vulnerability in the Linux Kernel

Title	Improper locking vulnerability in the Linux Kernel
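Summary	The following vulnerability has been resolved in the Linux kernel. The css percpu_ref kill that used to happen at cgroup rmdir time is now deferred until the cgroup is depopulated. A chain of commits going back to v7.0 reworked rmdir to satisfy the controller invariant that a subsystem's ->css_offline() must not run while tasks are still doing kernel-side work in the cgroup. [1] d245698d727a moved the task cset unlink from do_exit() to finish_task_switch(), so that a task is unlinked only after it has stopped scheduling. As a result, tasks past exit_signals() lingered on cset->tasks until their final context switch, and what userspace expected to see diverged from what the kernel needs to wait for. [2]-[5] tried to bridge that divergence: [2] excluded exiting tasks from cgroup.procs, [3] made rmdir(2) wait in TASK_UNINTERRUPTIBLE, [4] fixed the wait condition, and [5] made nr_dying_subsys_* visible synchronously. However, the cgroup_drain_dying() wait in [3] did not solve the underlying problem: a deadlock occurred when the rmdir caller was also the reaper of the zombie. The css killing side should instead be asynchronous; ->css_offline() already runs asynchronously from css_killed_work_fn(), driven by percpu_ref_kill_and_confirm(). The fix is to start that chain only after all tasks have left the cgroup. The user-visible side of rmdir returns as soon as cgroup.procs and related files are empty, while ->css_offline() does not run until the cgroup is fully drained. The original reproducer (pidns teardown plus zombie reaper) passes, and per-commit deterministic reproducers were also run. cgroup_apply_control_disable() has a pre-existing race of the same shape: kill_css() ran synchronously, and ->css_offline() could fire while tasks past exit_signals() were still linked to the csets. This patch preserves the synchronous behavior at that call site, and a follow-up patch will defer kill_css_finish(). The approach is considered appropriate, with no major problems identified. The changes are somewhat invasive but not excessively so, and can be backported to stable. If problems arise, the fallback is to revert the chain [1]-[5] and rework it in the development branch. v2 pins a reference with explicit cgroup_get()/cgroup_put() around the deferred destroy work. v1 was not actually broken, but the explicit reference removes the dependency on non-obvious invariants. Together, these changes to task-exit handling and css killing in cgroup rmdir prevent the deadlock and invalid state transitions.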
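Impact	・Information handled by the software is not leaked externally. ・Information handled by the software is not altered. ・The software may stop completely.
Solution	Release or patch information has been published. Refer to the references and apply the appropriate measures.
Date published	May 28, 2026 0:00
Date registered	June 12, 2026 14:48
Last updated	June 12, 2026 14:48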

CVSS3.0 : Medium
Score	5.5
Vector	CVSS:3.0/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
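
As a cross-check of the score against the vector, the following is a minimal Python sketch of the CVSS v3.0 base-score arithmetic for the Scope: Unchanged case only. The metric weights and the round-up rule follow the CVSS v3.0 specification; the names `WEIGHTS`, `roundup`, and `base_score` are illustrative and are not part of any JVN or NVD tooling.

```python
import math

# CVSS v3.0 metric weights. The PR values here apply to Scope: Unchanged only.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as the CVSS v3.0 specification requires."""
    return math.ceil(value * 10) / 10


def base_score(vector):
    """Compute the base score for a Scope: Unchanged CVSS v3.0 vector string."""
    # Skip the leading "CVSS:3.0" prefix and split each "metric:value" pair.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch handles Scope: Unchanged only")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.0/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 5.5
```

With this vector the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 1.83, which sum to roughly 5.43 and round up to the listed 5.5.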

Affected systems

Linux

Linux Kernel 6.19.12 or later and earlier than 7.0

Linux Kernel 7.0

Linux Kernel 7.0.1 or later and earlier than 7.0.9

Linux Kernel 7.1
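
The affected-version entries above can be checked mechanically. The sketch below is a hypothetical helper, not an official JVN or kernel.org tool. It assumes plain dotted numeric versions (no `-rc` or distribution suffixes) and treats the bare "7.0" and "7.1" entries as exact matches, which is an interpretation rather than something the entry states. Distribution kernels often carry backported fixes, so a result from this check is only a first hint.

```python
# Half-open ranges [low, high) taken from the "or later and earlier than" entries.
AFFECTED_RANGES = [
    ((6, 19, 12), (7, 0, 0)),
    ((7, 0, 1), (7, 0, 9)),
]

# Entries listed as single versions (assumed here to mean exact matches).
AFFECTED_EXACT = {(7, 0, 0), (7, 1, 0)}


def parse_version(version):
    """Turn '7.0' or '6.19.12' into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def is_affected(version):
    """Return True if the version falls inside one of the listed entries."""
    v = parse_version(version)
    if v in AFFECTED_EXACT:
        return True
    return any(low <= v < high for low, high in AFFECTED_RANGES)


for candidate in ["6.19.11", "6.19.12", "7.0", "7.0.8", "7.0.9", "7.1"]:
    print(candidate, is_affected(candidate))
```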

CVE (Common Vulnerabilities and Exposures)

No.	Name	URL
Common Vulnerabilities and Exposures (CVE)
1	CVE-2026-46223	https://www.cve.org/CVERecord?id=CVE-2026-46223
National Vulnerability Database (NVD)
2	CVE-2026-46223	https://nvd.nist.gov/vuln/detail/CVE-2026-46223

CWE (Common Weakness Enumeration)

No.	Name	URL
JVNDB
1	CWE-667	http://cwe.mitre.org/data/definitions/667.html

Other

No.	Name	URL
Related documents
1	cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated - kernel/git/stable/linux.git - Linux kernel stable tree	https://git.kernel.org/stable/c/33fa2e6b1507a0a377a151a8826438bedad1d0b0
2	cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated - kernel/git/stable/linux.git - Linux kernel stable tree	https://git.kernel.org/stable/c/93618edf753838a727dbff63c7c291dee22d656b

Change history

No.	Change	Date
1	[June 12, 2026] Initial publication	June 12, 2026 14:48

NVD vulnerability information

CVE-2026-46223

Summary	In the Linux kernel, the following vulnerability has been resolved:

cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated

A chain of commits going back to v7.0 reworked rmdir to satisfy the controller invariant that a subsystem's ->css_offline() must not run while tasks are still doing kernel-side work in the cgroup.

[1] d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
[2] a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup")
[3] 1b164b876c36 ("cgroup: Wait for dying tasks to leave on rmdir")
[4] 4c56a8ac6869 ("cgroup: Fix cgroup_drain_dying() testing the wrong condition")
[5] 13e786b64bd3 ("cgroup: Increment nr_dying_subsys_* from rmdir context")

[1] moved task cset unlink from do_exit() to finish_task_switch() so a task's cset link drops only after the task has fully stopped scheduling. That made tasks past exit_signals() linger on cset->tasks until their final context switch, which led to a series of problems as what userspace expected to see after rmdir diverged from what the kernel needs to wait for. [2]-[5] tried to bridge that divergence: [2] filtered the exiting tasks from cgroup.procs; [3] had rmdir(2) sleep in TASK_UNINTERRUPTIBLE for them; [4] fixed the wait's condition; [5] made nr_dying_subsys_* visible synchronously.

The cgroup_drain_dying() wait in [3] turned out to be a dead end. When the rmdir caller is also the reaper of a zombie that pins a pidns teardown (e.g. host PID 1 systemd reaping orphan pids that were re-parented to it during the same teardown), rmdir blocks in TASK_UNINTERRUPTIBLE waiting for those pids to free, the pids can't free because PID 1 is the reaper and it's stuck in rmdir, and the system A-A deadlocks. No internal lock ordering breaks this; the wait itself is the bug.

The css killing side that drove the original reorder, however, can be made cleanly asynchronous: ->css_offline() is already async, run from css_killed_work_fn() driven by percpu_ref_kill_and_confirm(). The fix is to make that chain start only after all tasks have left the cgroup. rmdir's user-visible side then returns as soon as cgroup.procs and friends are empty, while ->css_offline() still runs only after the cgroup is fully drained.

Verified by the original reproducer (pidns teardown + zombie reaper, runs under vng) which hangs vanilla and succeeds here, and by per-commit deterministic repros for [2], [3], [4], [5] with a boot parameter that widens the post-exit_signals() window so each state is reliably reachable. Some stress tests on top of that.

cgroup_apply_control_disable() has the same shape of pre-existing race: when a controller is disabled via subtree_control, kill_css() ran synchronously while tasks past exit_signals() could still be linked to the cgroup's csets, and ->css_offline() could fire before they drained. This patch preserves the existing synchronous behavior at that call site (kill_css_sync() + kill_css_finish() back-to-back) and a follow-up patch will defer kill_css_finish() there using a per-css trigger.

This seems like the right approach and I don't see problems with it. The changes are somewhat invasive but not excessively so, so backporting to -stable should be okay. If something does turn out to be wrong, the fallback is to revert the entire chain ([1]-[5]) and rework in the development branch instead.

v2: Pin cgrp across the deferred destroy work with explicit cgroup_get()/cgroup_put() around queue_work() and the work_fn. v1 wasn't actually broken (ordered cgroup_offline_wq + queue_work order in cgroup_task_dead() saved it) but the explicit ref removes the dependency on those non-obvious invariants. Also note the pre-existing cgroup_apply_control_disable() race in the description; a follow-up will defer kill_css_finish() there.
Date published	May 28, 2026 19:16
Date registered	May 29, 2026 4:14
Last updated	May 28, 2026 22:44
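
The ordering that the fix establishes can be illustrated with a toy, single-threaded Python model. This is not kernel code and does not mirror the actual patch: `ToyCgroup` and all of its methods and attributes are hypothetical names chosen for illustration, and the model ignores locking, percpu_ref, and workqueues entirely. It only shows the shape of the rule described above: rmdir returns once no visible tasks remain, while the offline step waits until the last lingering task is unlinked.

```python
class ToyCgroup:
    """Toy model: rmdir returns early, offline is deferred until drained."""

    def __init__(self):
        self.cset_tasks = set()  # stands in for tasks linked on cset->tasks
        self.exiting = set()     # tasks that are past exit_signals()
        self.removed = False     # rmdir has succeeded
        self.offlined = False    # stands in for ->css_offline() having run

    def attach(self, task):
        self.cset_tasks.add(task)

    def visible_procs(self):
        # Exiting tasks are filtered out of the user-visible listing.
        return self.cset_tasks - self.exiting

    def exit_signals(self, task):
        # The task starts exiting but stays linked until task_dead().
        self.exiting.add(task)

    def rmdir(self):
        # Fails (think EBUSY) only while user-visible tasks remain.
        if self.visible_procs():
            return False
        self.removed = True
        self._maybe_offline()  # no blocking wait for lingering tasks
        return True

    def task_dead(self, task):
        # Final unlink of the task, modelled as a plain method call.
        self.cset_tasks.discard(task)
        self.exiting.discard(task)
        self._maybe_offline()

    def _maybe_offline(self):
        # Offline only once the cgroup is removed and fully drained.
        if self.removed and not self.cset_tasks:
            self.offlined = True


cg = ToyCgroup()
cg.attach("t1")
cg.exit_signals("t1")
print(cg.rmdir(), cg.offlined)  # rmdir returns, offline still pending
cg.task_dead("t1")
print(cg.offlined)              # offline runs after the last task leaves
```

Because `rmdir()` in this model never waits, a caller that must itself reap the lingering task can go on to do so, which is the situation the blocking wait in [3] could not handle.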

References to advisories, solutions, and tools

No.	URL	refsource	Tags
1	https://git.kernel.org/stable/c/33fa2e6b1507a0a377a151a8826438bedad1d0b0	416baaa9-dc9f-4396-8d5f-8c081fb06d67
2	https://git.kernel.org/stable/c/93618edf753838a727dbff63c7c291dee22d656b	416baaa9-dc9f-4396-8d5f-8c081fb06d67

List of common vulnerabilities