[v5,01/10] capabilities: introduce CAP_PERFMON to kernel and user space


Message ID	9b77124b-675d-5ac7-3741-edec575bd425@linux.intel.com (mailing list archive)
State	New, archived
Series	Introduce CAP_PERFMON to secure system performance monitoring and observability

Commit Message

Alexey Budankov Jan. 20, 2020, 11:23 a.m. UTC

Introduce CAP_PERFMON capability designed to secure system performance
monitoring and observability operations so that CAP_PERFMON would assist
CAP_SYS_ADMIN capability in its governing role for perf_events, i915_perf
and other performance monitoring and observability subsystems.

CAP_PERFMON intends to harden system security and integrity during system
performance monitoring and observability operations by decreasing attack
surface that is available to a CAP_SYS_ADMIN privileged process [1].
Providing access to system performance monitoring and observability
operations under CAP_PERFMON capability singly, without the rest of
CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials and
makes operation more secure.

CAP_PERFMON intends to take over CAP_SYS_ADMIN credentials related to
system performance monitoring and observability operations and balance
amount of CAP_SYS_ADMIN credentials following the recommendations in the
capabilities man page [1] for CAP_SYS_ADMIN: "Note: this capability is
overloaded; see Notes to kernel developers, below."

Although the software running under CAP_PERFMON can not ensure avoidance
of related hardware issues, the software can still mitigate these issues
following the official embargoed hardware issues mitigation procedure [2].
The bugs in the software itself could be fixed following the standard
kernel development process [3] to maintain and harden security of system
performance monitoring and observability operations.

[1] http://man7.org/linux/man-pages/man7/capabilities.7.html
[2] https://www.kernel.org/doc/html/latest/process/embargoed-hardware-issues.html
[3] https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
---
 include/linux/capability.h          | 12 ++++++++++++
 include/uapi/linux/capability.h     |  8 +++++++-
 security/selinux/include/classmap.h |  4 ++--
 3 files changed, 21 insertions(+), 3 deletions(-)

Comments

Stephen Smalley Jan. 21, 2020, 2:43 p.m. UTC | #1

On 1/20/20 6:23 AM, Alexey Budankov wrote:
<SNIP>
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index ecce0f43c73a..8784969d91e1 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -251,6 +251,18 @@ extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct
>  extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
>  extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
>  extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
> +static inline bool perfmon_capable(void)
> +{
> +	struct user_namespace *ns = &init_user_ns;
> +
> +	if (ns_capable_noaudit(ns, CAP_PERFMON))
> +		return ns_capable(ns, CAP_PERFMON);
> +
> +	if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
> +		return ns_capable(ns, CAP_SYS_ADMIN);
> +
> +	return false;
> +}

Why _noaudit()? Normally only used when a permission failure is non-fatal to the operation. Otherwise, we want the audit message.
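To make the concern concrete, here is a minimal user-space sketch of the control flow in the quoted helper. It is not kernel code: mock_task, mock_capable() and mock_capable_noaudit() are hypothetical stand-ins for the task credentials, ns_capable() and ns_capable_noaudit(), the numeric value used for CAP_PERFMON is assumed, and the mock assumes that only denials leave an audit record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel symbols; the CAP_PERFMON value is assumed. */
enum { MOCK_CAP_SYS_ADMIN = 21, MOCK_CAP_PERFMON = 38 };

struct mock_task {
	uint64_t caps;         /* effective capability bitmask */
	unsigned int denials;  /* audit records emitted for denied checks */
};

static bool has_cap(const struct mock_task *t, int cap)
{
	assert(cap >= 0 && cap < 64);
	return (t->caps >> cap) & 1;
}

/* Stand-in for ns_capable_noaudit(): never emits an audit record. */
static bool mock_capable_noaudit(struct mock_task *t, int cap)
{
	return has_cap(t, cap);
}

/* Stand-in for ns_capable(): a denial leaves one record (assumed policy). */
static bool mock_capable(struct mock_task *t, int cap)
{
	if (has_cap(t, cap))
		return true;
	t->denials++;
	return false;
}

/* Same control flow as the v5 perfmon_capable() quoted above. */
static bool perfmon_capable_v5(struct mock_task *t)
{
	if (mock_capable_noaudit(t, MOCK_CAP_PERFMON))
		return mock_capable(t, MOCK_CAP_PERFMON);

	if (mock_capable_noaudit(t, MOCK_CAP_SYS_ADMIN))
		return mock_capable(t, MOCK_CAP_SYS_ADMIN);

	return false;  /* denied, but no audited check ever ran */
}
```

Under these assumptions, a task holding neither capability is refused while its denial counter stays at zero, which is the missing audit message the question points at.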

Alexey Budankov Jan. 21, 2020, 5:30 p.m. UTC | #2

On 21.01.2020 17:43, Stephen Smalley wrote:
> On 1/20/20 6:23 AM, Alexey Budankov wrote:
<SNIP>
>> +static inline bool perfmon_capable(void)
>> +{
>> +	struct user_namespace *ns = &init_user_ns;
>> +
>> +	if (ns_capable_noaudit(ns, CAP_PERFMON))
>> +		return ns_capable(ns, CAP_PERFMON);
>> +
>> +	if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
>> +		return ns_capable(ns, CAP_SYS_ADMIN);
>> +
>> +	return false;
>> +}
>
> Why _noaudit()? Normally only used when a permission failure is non-fatal to the operation. Otherwise, we want the audit message.

Some of the ideas are from the v4 review.
Well, on second thought, it definitely should be logged for CAP_SYS_ADMIN.
It is probably not so fatal for CAP_PERFMON, but personally I would
unconditionally log it for CAP_PERFMON as well.

Good catch, thank you.

~Alexey

Alexei Starovoitov Jan. 21, 2020, 5:55 p.m. UTC | #3

On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
<alexey.budankov@linux.intel.com> wrote:
> On 21.01.2020 17:43, Stephen Smalley wrote:
> > On 1/20/20 6:23 AM, Alexey Budankov wrote:
<SNIP>
> >> +static inline bool perfmon_capable(void)
> >> +{
> >> +	struct user_namespace *ns = &init_user_ns;
> >> +
> >> +	if (ns_capable_noaudit(ns, CAP_PERFMON))
> >> +		return ns_capable(ns, CAP_PERFMON);
> >> +
> >> +	if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
> >> +		return ns_capable(ns, CAP_SYS_ADMIN);
> >> +
> >> +	return false;
> >> +}
> >
> > Why _noaudit()? Normally only used when a permission failure is non-fatal to the operation. Otherwise, we want the audit message.
>
> Some of ideas from v4 review.

well, in the requested changes from v4 I wrote:
return capable(CAP_PERFMON);
instead of
return false;

That's what Andy suggested earlier for CAP_BPF.
I think that should resolve Stephen's concern.

Alexey Budankov Jan. 21, 2020, 6:27 p.m. UTC | #4

On 21.01.2020 20:55, Alexei Starovoitov wrote:
> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
> <alexey.budankov@linux.intel.com> wrote:
>> On 21.01.2020 17:43, Stephen Smalley wrote:
<SNIP>
>>> Why _noaudit()? Normally only used when a permission failure is non-fatal to the operation. Otherwise, we want the audit message.
>>
>> Some of ideas from v4 review.
>
> well, in the requested changes from v4 I wrote:
> return capable(CAP_PERFMON);
> instead of
> return false;

Aww, indeed. I was concerned about exactly that when updating the patch
and simply put false, missing the fact that capable() also logs.

I suppose the idea is originally from here [1].
BTW, has it already seen any _more optimal_ implementation?
Anyway, the original or an optimized version could be reused for CAP_PERFMON.

~Alexey

[1] https://patchwork.ozlabs.org/patch/1159243/

> That's what Andy suggested earlier for CAP_BPF.
> I think that should resolve Stephen's concern.

Alexey Budankov Jan. 22, 2020, 10:45 a.m. UTC | #5

On 21.01.2020 21:27, Alexey Budankov wrote:
> On 21.01.2020 20:55, Alexei Starovoitov wrote:
<SNIP>
>>>> Why _noaudit()? Normally only used when a permission failure is non-fatal to the operation. Otherwise, we want the audit message.

So far so good, I suggest using the simplest version for v6:

static inline bool perfmon_capable(void)
{
	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
}

It keeps the implementation simple and readable. The implementation is more
performant in the sense of calling the API: a CAP_PERFMON privileged process
needs only one capable() call.

Yes, it bloats the audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
but this bloating also advertises and encourages the more secure CAP_PERFMON
based approach to using the perf_event_open system call.

~Alexey

<SNIP>
>> That's what Andy suggested earlier for CAP_BPF.
>> I think that should resolve Stephen's concern.
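The trade-off described for the v6 helper can be sketched in user space as follows. This is an illustration only: mock_task and mock_capable() are hypothetical stand-ins for the task credentials and capable(), the numeric value of CAP_PERFMON is assumed, and the mock assumes that every denied check leaves an audit record naming the capability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel symbols; the CAP_PERFMON value is assumed. */
enum { MOCK_CAP_SYS_ADMIN = 21, MOCK_CAP_PERFMON = 38 };

struct mock_task {
	uint64_t caps;        /* effective capability bitmask */
	uint64_t denied;      /* capabilities named in audit denial records */
	unsigned int checks;  /* number of capable() calls made */
};

/* Stand-in for capable(): always audited; a denial is recorded (assumed policy). */
static bool mock_capable(struct mock_task *t, int cap)
{
	assert(cap >= 0 && cap < 64);
	t->checks++;
	if (t->caps & (UINT64_C(1) << cap))
		return true;
	t->denied |= UINT64_C(1) << cap;
	return false;
}

/* Same shape as the proposed v6 helper; || stops after the first success. */
static bool perfmon_capable_v6(struct mock_task *t)
{
	return mock_capable(t, MOCK_CAP_PERFMON) ||
	       mock_capable(t, MOCK_CAP_SYS_ADMIN);
}
```

In this model a task with CAP_PERFMON passes after a single check and no record. A task with only CAP_SYS_ADMIN passes but leaves a CAP_PERFMON denial behind, and a task with neither capability leaves two denials, which is the audit log bloat the message accepts.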

Stephen Smalley Jan. 22, 2020, 2:07 p.m. UTC | #6

On 1/22/20 5:45 AM, Alexey Budankov wrote:
<SNIP>
> So far so good, I suggest using the simplest version for v6:
>
> static inline bool perfmon_capable(void)
> {
> 	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
> }
>
> It keeps the implementation simple and readable. The implementation is more
> performant in the sense of calling the API - one capable() call for CAP_PERFMON
> privileged process.
>
> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
> but this bloating also advertises and leverages using more secure CAP_PERFMON
> based approach to use perf_event_open system call.

I can live with that. We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue. We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.
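The triage rule in this reply can be written down as a small decision function. The sketch below is hypothetical: cap_to_allow_first() is not a kernel or libcap API, the bitmask of denied capabilities is assumed to have been collected from audit records by some other means, and the numeric value of CAP_PERFMON is assumed.

```c
#include <stddef.h>
#include <stdint.h>

enum {
	MOCK_CAP_DAC_OVERRIDE = 1,
	MOCK_CAP_DAC_READ_SEARCH = 2,
	MOCK_CAP_SYS_ADMIN = 21,
	MOCK_CAP_PERFMON = 38,  /* assumed value */
};

/* {narrow, broad} capability pairs named in the message above. */
static const int cap_pairs[][2] = {
	{ MOCK_CAP_PERFMON, MOCK_CAP_SYS_ADMIN },
	{ MOCK_CAP_DAC_READ_SEARCH, MOCK_CAP_DAC_OVERRIDE },
};

/*
 * Given the capabilities seen in denial records for one process, return
 * the one to try allowing first, or -1 if none of the known pairs match.
 */
static int cap_to_allow_first(uint64_t denied)
{
	for (size_t i = 0; i < sizeof(cap_pairs) / sizeof(cap_pairs[0]); i++) {
		uint64_t narrow = UINT64_C(1) << cap_pairs[i][0];
		uint64_t broad = UINT64_C(1) << cap_pairs[i][1];

		if (denied & narrow)
			return cap_pairs[i][0];  /* prefer the narrower grant */
		if (denied & broad)
			return cap_pairs[i][1];
	}
	return -1;
}
```

When both CAP_PERFMON and CAP_SYS_ADMIN appear in the denials, the function picks CAP_PERFMON, which matches the advice to try the narrower capability before granting the broader one.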

Alexey Budankov Jan. 22, 2020, 2:25 p.m. UTC | #7

On 22.01.2020 17:07, Stephen Smalley wrote:
<SNIP>
> I can live with that. We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue. We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.

The perf security document [1] can be updated, at least, to align with and
document these audit logging specifics.

~Alexey

[1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html

Alexey Budankov Feb. 6, 2020, 6:03 p.m. UTC | #8

On 22.01.2020 17:25, Alexey Budankov wrote:
> On 22.01.2020 17:07, Stephen Smalley wrote:
>> On 1/22/20 5:45 AM, Alexey Budankov wrote:
>>> On 21.01.2020 21:27, Alexey Budankov wrote:
>>>> On 21.01.2020 20:55, Alexei Starovoitov wrote:
>>>>> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
>>>>> <alexey.budankov@linux.intel.com> wrote:
>>>>>> On 21.01.2020 17:43, Stephen Smalley wrote:
>>>>>>> On 1/20/20 6:23 AM, Alexey Budankov wrote:
>>>>>>>>
>>>>>>>> Introduce CAP_PERFMON capability designed to secure system performance
>>>>>>>> monitoring and observability operations so that CAP_PERFMON would assist
>>>>>>>> CAP_SYS_ADMIN capability in its governing role for perf_events, i915_perf
>>>>>>>> and other performance monitoring and observability subsystems.
>>>>>>>>
>>>>>>>> <SNIP>
>>>>>>>>
>>>>>>>> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
>>>>>>>
>>>>>>> Why _noaudit()? Normally only used when a permission failure is
>>>>>>> non-fatal to the operation. Otherwise, we want the audit message.
>>>
>>> So far so good, I suggest using the simplest version for v6:
>>>
>>> static inline bool perfmon_capable(void)
>>> {
>>> 	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
>>> }
>>>
>>> It keeps the implementation simple and readable. The implementation is more
>>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>>> privileged process.
>>>
>>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>>> based approach to use perf_event_open system call.
>>
>> I can live with that. We just need to document that when you see both a
>> CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only
>> allowing CAP_PERFMON first and see if that resolves the issue. We have a
>> similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.
>
> perf security [1] document can be updated, at least, to align and document
> this audit logging specifics.

And I plan to update the document right after this patch set is accepted.
Feel free to let me know of the places in the kernel docs that also
require update w.r.t CAP_PERFMON extension.

~Alexey

>
> ~Alexey
>
> [1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
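
The audit-log trade-off discussed above can be illustrated with a small user-space model. This is only a sketch, not kernel code: model_capable(), the model_* variables and denials_logged_for() are hypothetical stand-ins, and the model assumes, as the discussion does, that every failed capable() check leaves exactly one audit record. Only the two capability numbers are taken from include/uapi/linux/capability.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Capability numbers from include/uapi/linux/capability.h. */
#define CAP_SYS_ADMIN 21
#define CAP_PERFMON   38

/* Hypothetical stand-ins for the task's credentials and the audit log. */
static unsigned long long model_effective; /* bitmask of effective caps */
static int model_audit_records;            /* denials logged so far */

/* Model of capable(): assume a failed check leaves one audit record. */
static bool model_capable(int cap)
{
	assert(cap >= 0 && cap < 64);
	if (model_effective & (1ULL << cap))
		return true;
	model_audit_records++;
	return false;
}

/* Same shape as the perfmon_capable() helper suggested for v6. */
static bool model_perfmon_capable(void)
{
	return model_capable(CAP_PERFMON) || model_capable(CAP_SYS_ADMIN);
}

/*
 * Run the check for a given effective set. Stores the verdict in *allowed
 * and returns how many denial records the check produced.
 */
int denials_logged_for(unsigned long long effective, bool *allowed)
{
	model_effective = effective;
	model_audit_records = 0;
	*allowed = model_perfmon_capable();
	return model_audit_records;
}
```

In this model a task holding only CAP_PERFMON is allowed after a single check and logs nothing, a task holding only CAP_SYS_ADMIN is allowed but logs one CAP_PERFMON denial first, and a task holding neither logs both denials. The last case is the pair of messages Stephen suggests documenting.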

Thomas Gleixner Feb. 7, 2020, 11:38 a.m. UTC | #9

Alexey Budankov <alexey.budankov@linux.intel.com> writes:
> On 22.01.2020 17:25, Alexey Budankov wrote:
>> On 22.01.2020 17:07, Stephen Smalley wrote:
>>>> It keeps the implementation simple and readable. The implementation is more
>>>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>>>> privileged process.
>>>>
>>>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>>>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>>>> based approach to use perf_event_open system call.
>>>
>>> I can live with that. We just need to document that when you see
>>> both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process,
>>> try only allowing CAP_PERFMON first and see if that resolves the
>>> issue. We have a similar issue with CAP_DAC_READ_SEARCH versus
>>> CAP_DAC_OVERRIDE.
>>
>> perf security [1] document can be updated, at least, to align and document
>> this audit logging specifics.
>
> And I plan to update the document right after this patch set is accepted.
> Feel free to let me know of the places in the kernel docs that also
> require update w.r.t CAP_PERFMON extension.

The documentation update wants to be part of the patch set and not planned
to be done _after_ the patch set is merged.

Thanks,

        tglx

Alexey Budankov Feb. 7, 2020, 1:39 p.m. UTC | #10

On 07.02.2020 14:38, Thomas Gleixner wrote:
> Alexey Budankov <alexey.budankov@linux.intel.com> writes:
>> On 22.01.2020 17:25, Alexey Budankov wrote:
>>> On 22.01.2020 17:07, Stephen Smalley wrote:
>>> <SNIP>
>>> perf security [1] document can be updated, at least, to align and document
>>> this audit logging specifics.
>>
>> And I plan to update the document right after this patch set is accepted.
>> Feel free to let me know of the places in the kernel docs that also
>> require update w.r.t CAP_PERFMON extension.
>
> The documentation update wants to be part of the patch set and not planned
> to be done _after_ the patch set is merged.

Well, accepted. It is going to be patches #11 and beyond.

Thanks,
Alexey

>
> Thanks,
>
>         tglx
>

Alexey Budankov Feb. 12, 2020, 8:53 a.m. UTC | #11

Hi Stephen,

On 22.01.2020 17:07, Stephen Smalley wrote:
> On 1/22/20 5:45 AM, Alexey Budankov wrote:
>> On 21.01.2020 21:27, Alexey Budankov wrote:
>>> On 21.01.2020 20:55, Alexei Starovoitov wrote:
>>>> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov
>>>> <alexey.budankov@linux.intel.com> wrote:
>>>>> On 21.01.2020 17:43, Stephen Smalley wrote:
>>>>>> On 1/20/20 6:23 AM, Alexey Budankov wrote:
>>>>>>>
<SNIP>
>>>>>>> Introduce CAP_PERFMON capability designed to secure system performance
>>>>>>
>>>>>> Why _noaudit()? Normally only used when a permission failure is
>>>>>> non-fatal to the operation. Otherwise, we want the audit message.
>>
>> So far so good, I suggest using the simplest version for v6:
>>
>> static inline bool perfmon_capable(void)
>> {
>> 	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
>> }
>>
>> It keeps the implementation simple and readable. The implementation is more
>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>> privileged process.
>>
>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>> based approach to use perf_event_open system call.
>
> I can live with that. We just need to document that when you see both a
> CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only
> allowing CAP_PERFMON first and see if that resolves the issue. We have a
> similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.

I am trying to reproduce this double logging with CAP_PERFMON.
I am using the refpolicy version with enabled perf_event tclass [1], in permissive mode.
When running perf stat -a I am observing these AVC audit messages:

type=AVC msg=audit(1581496695.666:8691): avc: denied { open } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
type=AVC msg=audit(1581496695.666:8691): avc: denied { kernel } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
type=AVC msg=audit(1581496695.666:8691): avc: denied { cpu } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1
type=AVC msg=audit(1581496695.666:8692): avc: denied { write } for pid=2779 comm="perf" scontext=user_u:user_r:user_systemd_t tcontext=user_u:user_r:user_systemd_t tclass=perf_event permissive=1

However there are no capability related messages around. I suppose my refpolicy should
be modified somehow to observe capability related AVCs.

Could you please comment or clarify on how to enable caps related AVCs in order
to test the concerned logging.

Thanks,
Alexey

---
[1] https://github.com/SELinuxProject/refpolicy.git

Stephen Smalley Feb. 12, 2020, 1:32 p.m. UTC | #12

On 2/12/20 3:53 AM, Alexey Budankov wrote:
> Hi Stephen,
>
> <SNIP>
>
> I am trying to reproduce this double logging with CAP_PERFMON.
> I am using the refpolicy version with enabled perf_event tclass [1], in permissive mode.
> When running perf stat -a I am observing this AVC audit messages:
>
> <SNIP>
>
> However there is no capability related messages around. I suppose my refpolicy should
> be modified somehow to observe capability related AVCs.
>
> Could you please comment or clarify on how to enable caps related AVCs in order
> to test the concerned logging.

The new perfmon permission has to be defined in your policy; you'll have a
message in dmesg about "Permission perfmon in class capability2 not defined
in policy.". You can either add it to the common cap2 definition in
refpolicy/policy/flask/access_vectors and rebuild your policy or extract
your base module as CIL, add it there, and insert the updated module.

Alexey Budankov Feb. 12, 2020, 1:53 p.m. UTC | #13

On 12.02.2020 16:32, Stephen Smalley wrote:
> On 2/12/20 3:53 AM, Alexey Budankov wrote:
>> <SNIP>
>> Could you please comment or clarify on how to enable caps related AVCs in order
>> to test the concerned logging.
>
> The new perfmon permission has to be defined in your policy; you'll have a
> message in dmesg about "Permission perfmon in class capability2 not defined
> in policy.". You can either add it to the common cap2 definition in
> refpolicy/policy/flask/access_vectors and rebuild your policy or extract
> your base module as CIL, add it there, and insert the updated module.

Yes, I already have it like this:

common cap2
{
	mac_override	# unused by SELinux
	mac_admin
	syslog
	wake_alarm
	block_suspend
	audit_read
	perfmon
}

dmesg stopped reporting perfmon as not defined but audit.log still doesn't report CAP_PERFMON denials.
BTW, audit doesn't even report CAP_SYS_ADMIN denials, however perfmon_capable() does check for it.

~Alexey

Stephen Smalley Feb. 12, 2020, 3:21 p.m. UTC | #14

On 2/12/20 8:53 AM, Alexey Budankov wrote:
> On 12.02.2020 16:32, Stephen Smalley wrote:
>> <SNIP>
>> You can either add it to the common cap2 definition in
>> refpolicy/policy/flask/access_vectors and rebuild your policy or extract
>> your base module as CIL, add it there, and insert the updated module.
>
> Yes, I already have it like this:
> common cap2
> {
> 	mac_override	# unused by SELinux
> 	mac_admin
> 	syslog
> 	wake_alarm
> 	block_suspend
> 	audit_read
> 	perfmon
> }
>
> dmesg stopped reporting perfmon as not defined but audit.log still doesn't report CAP_PERFMON denials.
> BTW, audit doesn't even report CAP_SYS_ADMIN denials, however perfmon_capable() does check for it.

Some denials may be silenced by dontaudit rules; semodule -DB will strip
those and semodule -B will restore them. The other possibility is that the
process doesn't have CAP_PERFMON in its effective set and therefore never
reaches SELinux at all; it is denied first by the capability module.
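
The second possibility can be checked from user space by looking at the CapEff field of /proc/<pid>/status, which holds the effective capability set as a hexadecimal bitmask. The following is a minimal sketch under stated assumptions: the helper names cap_in_mask(), parse_cap_eff() and self_has_perfmon() are hypothetical, CAP_PERFMON is taken to be bit 38 as in the mainline uapi header, and a 4 KiB buffer is assumed to be large enough to hold the status file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CAP_PERFMON 38 /* from include/uapi/linux/capability.h */

/* Test a single capability bit in a CapEff-style bitmask. */
bool cap_in_mask(unsigned long long mask, int cap)
{
	assert(cap >= 0 && cap < 64);
	return (mask >> cap) & 1ULL;
}

/*
 * Extract the hexadecimal CapEff value from text laid out like
 * /proc/<pid>/status. Returns false if no CapEff field is present.
 */
bool parse_cap_eff(const char *status_text, unsigned long long *mask)
{
	const char *field = strstr(status_text, "CapEff:");

	if (!field)
		return false;
	/* strtoull() skips the tab that separates the label from the value. */
	*mask = strtoull(field + strlen("CapEff:"), NULL, 16);
	return true;
}

/*
 * Report whether the calling process has CAP_PERFMON in its effective set:
 * 1 if yes, 0 if no, -1 if /proc/self/status could not be read or parsed.
 */
int self_has_perfmon(void)
{
	char buf[4096]; /* assumption: the status file fits in 4 KiB */
	unsigned long long mask;
	size_t len;
	FILE *f = fopen("/proc/self/status", "r");

	if (!f)
		return -1;
	len = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[len] = '\0';
	if (!parse_cap_eff(buf, &mask))
		return -1;
	return cap_in_mask(mask, CAP_PERFMON) ? 1 : 0;
}
```

If self_has_perfmon() returns 0 for the process running perf, the request is refused by the capability module before SELinux is consulted, so no capability AVC would be expected regardless of dontaudit rules.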

Stephen Smalley Feb. 12, 2020, 3:45 p.m. UTC | #15

On 2/12/20 10:21 AM, Stephen Smalley wrote:
> On 2/12/20 8:53 AM, Alexey Budankov wrote:
>> <SNIP>
>> dmesg stopped reporting perfmon as not defined but audit.log still
>> doesn't report CAP_PERFMON denials.
>> BTW, audit even doesn't report CAP_SYS_ADMIN denials, however
>> perfmon_capable() does check for it.
>
> Some denials may be silenced by dontaudit rules; semodule -DB will strip
> those and semodule -B will restore them. Other possibility is that the
> process doesn't have CAP_PERFMON in its effective set and therefore
> never reaches SELinux at all; denied first by the capability module.

Also, the fact that your denials are showing up in user_systemd_t suggests
that something is off in your policy or userspace/distro; I assume that is
a domain type for the systemd --user instance, but your shell and commands
shouldn't be running in that domain (user_t would be more appropriate for
that).

Alexey Budankov Feb. 12, 2020, 4:16 p.m. UTC | #16

On 12.02.2020 18:21, Stephen Smalley wrote:
> On 2/12/20 8:53 AM, Alexey Budankov wrote:
>> <SNIP>
>> dmesg stopped reporting perfmon as not defined but audit.log still doesn't report CAP_PERFMON denials.
>> BTW, audit even doesn't report CAP_SYS_ADMIN denials, however perfmon_capable() does check for it.
>
> Some denials may be silenced by dontaudit rules; semodule -DB will strip
> those and semodule -B will restore them. Other possibility is that the
> process doesn't have CAP_PERFMON in its effective set and therefore never
> reaches SELinux at all; denied first by the capability module.

Yes, that all makes sense.
selinux_capable() calls avc_audit() logging but cap_capable() doesn't, so proper order matters.
I am doing debug tracing of the kernel code to reveal the exact reasons.

~Alexey
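
The ordering point can be sketched as a toy model of the two checks. It is a simplification, not the LSM code: struct model_task and model_security_capable() are hypothetical names, the capability module is assumed to run first and to deny silently, and the SELinux step is assumed to log an AVC denial unless a dontaudit rule covers it, while permissive mode lets the request through anyway.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of the inputs to one capability check. */
struct model_task {
	bool cap_effective; /* capability is in the effective set */
	bool policy_allows; /* SELinux policy grants the permission */
	bool dontaudit;     /* a dontaudit rule covers the denial */
	bool permissive;    /* SELinux is in permissive mode */
};

/*
 * Returns true if the request is allowed. *avc_logged reports whether the
 * SELinux step emitted an AVC denial for it.
 */
bool model_security_capable(const struct model_task *t, bool *avc_logged)
{
	*avc_logged = false;

	/* Step 1: capability module. A denial here produces no AVC. */
	if (!t->cap_effective)
		return false;

	/* Step 2: SELinux, reached only when step 1 passed. */
	if (t->policy_allows)
		return true;

	/* Denied by policy: logged unless silenced by a dontaudit rule. */
	*avc_logged = !t->dontaudit;

	/* Permissive mode records the denial but still allows the request. */
	return t->permissive;
}
```

Both explanations given in the thread show up as a missing AVC in this model: a task without the capability in its effective set never reaches step 2, and a task that does reach it stays silent when a dontaudit rule applies.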

Alexey Budankov Feb. 12, 2020, 4:56 p.m. UTC | #17

On 12.02.2020 18:45, Stephen Smalley wrote:
> On 2/12/20 10:21 AM, Stephen Smalley wrote:
>> <SNIP>
>> Some denials may be silenced by dontaudit rules; semodule -DB will strip
>> those and semodule -B will restore them. Other possibility is that the
>> process doesn't have CAP_PERFMON in its effective set and therefore never
>> reaches SELinux at all; denied first by the capability module.
>
> Also, the fact that your denials are showing up in user_systemd_t suggests
> that something is off in your policy or userspace/distro; I assume that is
> a domain type for the systemd --user instance, but your shell and commands
> shouldn't be running in that domain (user_t would be more appropriate for
> that).

It is user_t for local terminal session:

ps -Z
LABEL                            PID TTY          TIME CMD
user_u:user_r:user_t           11317 pts/9    00:00:00 bash
user_u:user_r:user_t           11796 pts/9    00:00:00 ps

For local terminal root session:

ps -Z
LABEL                            PID TTY          TIME CMD
user_u:user_r:user_su_t         2926 pts/3    00:00:00 bash
user_u:user_r:user_su_t        10995 pts/3    00:00:00 ps

For remote ssh session:

ps -Z
LABEL                            PID TTY          TIME CMD
user_u:user_r:user_t            7540 pts/8    00:00:00 ps
user_u:user_r:user_systemd_t    8875 pts/8    00:00:00 bash

~Alexey

Stephen Smalley Feb. 12, 2020, 5:09 p.m. UTC | #18

On 2/12/20 11:56 AM, Alexey Budankov wrote:
> On 12.02.2020 18:45, Stephen Smalley wrote:
>> <SNIP>
>> Also, the fact that your denials are showing up in user_systemd_t suggests that something is off in your policy or userspace/distro; I assume that is a domain type for the systemd --user instance, but your shell and commands shouldn't be running in that domain (user_t would be more appropriate for that).
>
> It is user_t for local terminal session:
> ps -Z
> LABEL PID TTY TIME CMD
> user_u:user_r:user_t 11317 pts/9 00:00:00 bash
> user_u:user_r:user_t 11796 pts/9 00:00:00 ps
>
> For local terminal root session:
> ps -Z
> LABEL PID TTY TIME CMD
> user_u:user_r:user_su_t 2926 pts/3 00:00:00 bash
> user_u:user_r:user_su_t 10995 pts/3 00:00:00 ps
>
> For remote ssh session:
> ps -Z
> LABEL PID TTY TIME CMD
> user_u:user_r:user_t 7540 pts/8 00:00:00 ps
> user_u:user_r:user_systemd_t 8875 pts/8 00:00:00 bash

That's a bug in either your policy or your userspace/distro integration. In any event, unless user_systemd_t is allowed all capability2 permissions by your policy, you should see the denials if CAP_PERFMON is set in the effective capability set of the process.
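
A rough user-space sketch of the ordering described in this exchange may help: the capability module tests the effective set first, and SELinux is consulted (and can log an AVC denial) only if that test passes. Everything below (`struct model_task`, `model_capable()`, the bitmask fields) is hypothetical and for illustration only. It is not kernel code, and it ignores AVC caching, user namespaces and dontaudit rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CAP_SYS_ADMIN 21
#define CAP_PERFMON   38
#define CAP_LAST_CAP  CAP_PERFMON

/* Hypothetical model of a task, not a kernel structure. */
struct model_task {
	uint64_t effective;       /* bit N set: capability N is in the effective set */
	uint64_t policy_allowed;  /* bit N set: the SELinux policy allows capability N */
	bool permissive;          /* SELinux permissive mode */
	unsigned int avc_denials; /* AVC denials this model would log */
};

/* Assumed two-step check: capability module first, SELinux second. */
static bool model_capable(struct model_task *t, int cap)
{
	assert(cap >= 0 && cap <= CAP_LAST_CAP);
	uint64_t bit = UINT64_C(1) << cap;

	/* Step 1: capability not effective, so the check fails here and
	 * SELinux is never reached; no AVC record can be produced. */
	if (!(t->effective & bit))
		return false;

	/* Step 2: SELinux. A missing allow rule is logged as a denial;
	 * in permissive mode the access is still granted. */
	if (!(t->policy_allowed & bit)) {
		t->avc_denials++;
		return t->permissive;
	}
	return true;
}
```

Under these assumptions, a task without CAP_PERFMON in its effective set produces no AVC denial at all, whereas a task that holds it but lacks the policy rule produces one (and, in permissive mode, still succeeds).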

Alexey Budankov Feb. 13, 2020, 9:05 a.m. UTC | #19

On 12.02.2020 20:09, Stephen Smalley wrote:
>>>>>>>>>>>> On 21.01.2020 17:43, Stephen Smalley wrote:
>>>>>>>>>>>>> Why _noaudit()? Normally only used when a permission failure is non-fatal to the operation. Otherwise, we want the audit message.
<SNIP>
>>>>>>>> On 1/22/20 5:45 AM, Alexey Budankov wrote:
>>>>>>>>> So far so good, I suggest using the simplest version for v6:
>>>>>>>>>
>>>>>>>>> static inline bool perfmon_capable(void)
>>>>>>>>> {
>>>>>>>>> 	return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
>>>>>>>>> }
<SNIP>
>>>>>>>> I can live with that. We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue. We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.
<SNIP>
> That's a bug in either your policy or your userspace/distro integration. In any event, unless user_systemd_t is allowed all capability2 permissions by your policy, you should see the denials if CAP_PERFMON is set in the effective capability set of the process.
>

That all seems to be true. After instrumentation, rebuilding and rebooting, in CAP_PERFMON case:

$ getcap perf
perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep

$ perf stat -a

type=AVC msg=audit(1581580399.165:784): avc: denied { open } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:785): avc: denied { perfmon } for pid=8859 comm="perf" capability=38 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability2 permissive=1
type=AVC msg=audit(1581580399.165:786): avc: denied { kernel } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:787): avc: denied { cpu } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:788): avc: denied { write } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580408.078:791): avc: denied { read } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1

dmesg:

[ 137.877713] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 137.877774] cread_has_capability(CAP_PERFMON) = 0
[ 137.877775] prior avc_audit(CAP_PERFMON)
[ 137.877779] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = 0
[ 137.877784] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 137.877785] cread_has_capability(CAP_PERFMON) = 0
[ 137.877786] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = 0
[ 137.877794] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 137.877795] cread_has_capability(CAP_PERFMON) = 0
[ 137.877796] security_capable(0000000071f7ee6e, 000000009dd7a5fc, CAP_PERFMON, 0) = 0
...

in CAP_SYS_ADMIN case:

$ getcap perf
perf = cap_sys_ptrace,cap_sys_admin,cap_syslog+ep

$ perf stat -a

type=AVC msg=audit(1581580747.928:835): avc: denied { open } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:836): avc: denied { cpu } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:837): avc: denied { kernel } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:838): avc: denied { read } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580747.928:839): avc: denied { write } for pid=8927 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
...

$ perf record -- ls
...
type=AVC msg=audit(1581580747.930:843): avc: denied { sys_ptrace } for pid=8927 comm="perf" capability=19 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability permissive=1
...

dmesg:

[ 276.714266] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 276.714268] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 276.714269] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 276.714270] cread_has_capability(CAP_SYS_ADMIN) = 0
[ 276.714270] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = 0
[ 276.714287] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 276.714287] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 276.714288] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 276.714288] cread_has_capability(CAP_SYS_ADMIN) = 0
[ 276.714289] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = 0
[ 276.714294] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 276.714295] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 276.714295] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 276.714296] cread_has_capability(CAP_SYS_ADMIN) = 0
[ 276.714296] security_capable(000000006b09ad8a, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = 0
...

in unprivileged case:

$ getcap perf
perf =

$ perf stat -a; perf record -a
...

dmesg:

[ 947.275611] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 947.275613] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 947.275614] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 947.275615] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = -1
[ 947.275636] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = ?
[ 947.275637] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_PERFMON, 0) = -1
[ 947.275638] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = ?
[ 947.275638] security_capable(00000000d3a75377, 000000009dd7a5fc, CAP_SYS_ADMIN, 0) = -1
...

So it looks like CAP_PERFMON and CAP_SYS_ADMIN are not ever logged by AVC simultaneously,
in the current LSM and perfmon_capable() implementations.

If perfmon is granted:
perfmon is not logged by capabilities, perfmon is logged by AVC,
no check for sys_admin by perfmon_capable().

If perfmon is not granted but sys_admin is granted:
perfmon is not logged by capabilities, AVC logging is not called for perfmon,
sys_admin is not logged by capabilities, sys_admin is not logged by AVC, for some intended reason?

No caps are granted:
AVC logging is not called either for perfmon or for sys_admin.

BTW, is there a way to may be drop some AV cache so denials would appear in audit in the next AV access?

Well, I guess you have initially mentioned some case similar to this (note that ids are not the same but pids= are):

type=AVC msg=audit(1581580399.165:784): avc: denied { open } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:785): avc: denied { perfmon } for pid=8859 comm="perf" capability=38 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability2 permissive=1
type=AVC msg=audit( . : ): avc: denied { sys_admin } for pid=8859 comm="perf" capability=21 scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=capability2 permissive=1
type=AVC msg=audit(1581580399.165:786): avc: denied { kernel } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:787): avc: denied { cpu } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580399.165:788): avc: denied { write } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1
type=AVC msg=audit(1581580408.078:791): avc: denied { read } for pid=8859 comm="perf" scontext=user_u:user_r:user_t tcontext=user_u:user_r:user_t tclass=perf_event permissive=1

So the message could be like this:

"If audit logs for a process using perf_events related syscalls i.e. perf_event_open(), read(), write(), ioctl(), mmap() contain denials both for CAP_PERFMON and CAP_SYS_ADMIN capabilities then providing the process with CAP_PERFMON capability singly is the secure preferred approach to resolve access denials to performance monitoring and observability operations."

~Alexey
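
The three dmesg traces in this comment are consistent with plain short-circuit evaluation of the simplified `perfmon_capable()` quoted in this thread (`capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN)`). A minimal user-space sketch follows, with hypothetical names (`struct check_trace`, `traced_capable()`, `model_perfmon_capable()`) and assuming that a capability check is nothing more than a test of the effective bitmask:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CAP_SYS_ADMIN 21
#define CAP_PERFMON   38
#define MAX_TRACE     4

/* Hypothetical trace of which capabilities were checked, in order. */
struct check_trace {
	uint64_t effective;      /* bit N set: capability N is effective */
	int checked[MAX_TRACE];  /* capabilities checked so far */
	int nchecked;            /* number of entries used in checked[] */
};

/* Stand-in for capable(): records the check, then tests the bitmask. */
static bool traced_capable(struct check_trace *t, int cap)
{
	assert(t->nchecked < MAX_TRACE);
	t->checked[t->nchecked++] = cap;
	return (t->effective >> cap) & 1;
}

/* Same shape as the v6 proposal: CAP_SYS_ADMIN is only checked
 * when the CAP_PERFMON check has already failed. */
static bool model_perfmon_capable(struct check_trace *t)
{
	return traced_capable(t, CAP_PERFMON) ||
	       traced_capable(t, CAP_SYS_ADMIN);
}
```

With CAP_PERFMON effective, only one check is recorded, which matches the first trace; with only CAP_SYS_ADMIN, or with no capabilities, both checks are recorded in the order CAP_PERFMON then CAP_SYS_ADMIN, as in the second and third traces.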

Alexey Budankov Feb. 20, 2020, 1:05 p.m. UTC | #20

On 07.02.2020 16:39, Alexey Budankov wrote:
>
> On 07.02.2020 14:38, Thomas Gleixner wrote:
>> Alexey Budankov <alexey.budankov@linux.intel.com> writes:
>>> On 22.01.2020 17:25, Alexey Budankov wrote:
>>>> On 22.01.2020 17:07, Stephen Smalley wrote:
>>>>>> It keeps the implementation simple and readable. The implementation is more
>>>>>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>>>>>> privileged process.
>>>>>>
>>>>>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>>>>>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>>>>>> based approach to use perf_event_open system call.
>>>>>
>>>>> I can live with that. We just need to document that when you see
>>>>> both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process,
>>>>> try only allowing CAP_PERFMON first and see if that resolves the
>>>>> issue. We have a similar issue with CAP_DAC_READ_SEARCH versus
>>>>> CAP_DAC_OVERRIDE.
>>>>
>>>> perf security [1] document can be updated, at least, to align and document
>>>> this audit logging specifics.
>>>
>>> And I plan to update the document right after this patch set is accepted.
>>> Feel free to let me know of the places in the kernel docs that also
>>> require update w.r.t CAP_PERFMON extension.
>>
>> The documentation update wants be part of the patch set and not planned
>> to be done _after_ the patch set is merged.
>
> Well, accepted. It is going to make patches #11 and beyond.

Patches #11 and #12 of v7 [1] contain information on CAP_PERFMON intention and usage.
Patch for man-pages [2] extends perf_event_open.2 documentation.

Thanks,
Alexey

---
[1] https://lore.kernel.org/lkml/c8de937a-0b3a-7147-f5ef-69f467e87a13@linux.intel.com/
[2] https://lore.kernel.org/lkml/18d1083d-efe5-f5f8-c531-d142c0e5c1a8@linux.intel.com/

Patch

diff --git a/include/linux/capability.h b/include/linux/capability.h
index ecce0f43c73a..8784969d91e1 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -251,6 +251,18 @@ extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct
 extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
 extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
 extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
+static inline bool perfmon_capable(void)
+{
+	struct user_namespace *ns = &init_user_ns;
+
+	if (ns_capable_noaudit(ns, CAP_PERFMON))
+		return ns_capable(ns, CAP_PERFMON);
+
+	if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
+		return ns_capable(ns, CAP_SYS_ADMIN);
+
+	return false;
+}
 
 /* audit system wants to get cap info from files as well */
 extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 240fdb9a60f6..8b416e5f3afa 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -366,8 +366,14 @@ struct vfs_ns_cap_data {
 
 #define CAP_AUDIT_READ		37
 
+/*
+ * Allow system performance and observability privileged operations
+ * using perf_events, i915_perf and other kernel subsystems
+ */
+
+#define CAP_PERFMON		38
 
-#define CAP_LAST_CAP         CAP_AUDIT_READ
+#define CAP_LAST_CAP         CAP_PERFMON
 
 #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
 
diff --git a/security/selinux/include/classmap.h b/security/selinux/include/classmap.h
index 7db24855e12d..c599b0c2b0e7 100644
--- a/security/selinux/include/classmap.h
+++ b/security/selinux/include/classmap.h
@@ -27,9 +27,9 @@
 	    "audit_control", "setfcap"
 
 #define COMMON_CAP2_PERMS  "mac_override", "mac_admin", "syslog", \
-		"wake_alarm", "block_suspend", "audit_read"
+		"wake_alarm", "block_suspend", "audit_read", "perfmon"
 
-#if CAP_LAST_CAP > CAP_AUDIT_READ
+#if CAP_LAST_CAP > CAP_PERFMON
 #error New capability defined, please update COMMON_CAP2_PERMS.
 #endif
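
For comparison with the simplified form discussed in the comments, the probe-then-audit pattern of the v5 `perfmon_capable()` in this patch can be modelled in user space. This is only a sketch under simplifying assumptions: `struct audit_model`, `model_check()` and `v5_perfmon_capable()` are hypothetical names, a single bitmask stands in for the combined capability and LSM decision, and only audited checks and audited denials are counted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CAP_SYS_ADMIN 21
#define CAP_PERFMON   38

/* Hypothetical model, not a kernel structure. */
struct audit_model {
	uint64_t granted;             /* bit N set: a check for capability N succeeds */
	unsigned int audited_checks;  /* checks made with auditing enabled */
	unsigned int audited_denials; /* audited checks that failed */
};

/* Stand-in for ns_capable() (audit = true) and
 * ns_capable_noaudit() (audit = false). */
static bool model_check(struct audit_model *m, int cap, bool audit)
{
	assert(cap >= 0 && cap < 64);
	bool ok = (m->granted >> cap) & 1;

	if (audit) {
		m->audited_checks++;
		if (!ok)
			m->audited_denials++;
	}
	return ok;
}

/* Same control flow as the v5 helper: a silent probe first, and an
 * audited check only after the probe has already succeeded. */
static bool v5_perfmon_capable(struct audit_model *m)
{
	if (model_check(m, CAP_PERFMON, false))
		return model_check(m, CAP_PERFMON, true);

	if (model_check(m, CAP_SYS_ADMIN, false))
		return model_check(m, CAP_SYS_ADMIN, true);

	return false;
}
```

In this model an audited check is reached only after the silent probe has succeeded, so a failing caller never produces an audited denial. That appears to be the concern behind the `_noaudit()` question raised in review of this version.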