docker escape exploit

Module: exploit/linux/local/docker_privileged_container_escape
Reference: https://blog.trailofbits.com/2019/07/19/understanding-docker-container-escapes/

Messages in the module source include CheckCode::Appears('Inside Docker container and target appears vulnerable') (line 87) and fail_with Failure::BadConfig, "#{base_dir_container} is not writable" (line 92).

The runC vulnerability is identified as CVE-2019-5736. Once inside the container, the runC init process calls the execve syscall to overwrite itself with the user-requested binary. The proc filesystem is not stored on disk. Instead, the kernel creates it in memory. The key is that /proc/[pid]/exe does not follow the normal semantics for symbolic links.

A seccomp profile protects Linux namespace boundaries by blocking dangerous system calls from being used by pods that are isolated using such namespaces. Docker and other runtimes have used such profiles to protect namespace boundaries for a long time.

To take advantage of the Dirty COW bug for a container escape, we need a resource that is shared across all processes.
The level of user interaction required is being able to run any command as root within a container in either of these contexts: creating a new container from an attacker-controlled image, or attaching (docker exec) to an existing container that the attacker previously had write access to. Those two scenarios might seem different, but both require runC to spin up a new process in a container and are implemented similarly. We can proceed to overwrite the runC binary from a separate process in the container through /proc/[runc-pid]/exe. The #! syntax is called a shebang and is used in scripts to specify an interpreter. [2] For those familiar with Windows DLLs, an initialization function resembles DllMain.

Docker and LXC are affected differently because they define privileged containers differently. In its fix, LXC executes a sealed, in-memory copy of itself instead of the original on-disk binary.

So we have shared memory (a library) mapped into all processes that contains code that will be executed. What to do with it? scumjr pulled together a POC that uses Dirty COW to modify the clock_gettime() function in the vDSO memory space. The shellcode begins by checking if it's being invoked by root. Next, it checks for the presence of the file "/tmp/.x".

This type of technique becomes very relevant in Kubernetes and the container world, where Linux namespaces are used to isolate pods. The fsconfig system call is used to add a NULL-terminated string pointed to by val. Figures 1 and 2 show CrowdStrike alerts when a Kubernetes workload is detected running without a seccomp or AppArmor/SELinux profile or with an incorrect seccomp profile. Such findings are reported as indicators of misconfiguration (IOMs). Figure caption: IOMs that CrowdStrike identified as a missing AppArmor or SELinux profile on a Kubernetes workload.
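To make the shebang rule concrete, here is a minimal Python sketch. The helper name parse_shebang is hypothetical, and the logic is only a simplified model of how an interpreter is picked from a script's first line; it is not the kernel's actual implementation.

```python
from typing import Optional, Tuple


def parse_shebang(first_line: bytes) -> Optional[Tuple[str, str]]:
    """Return (interpreter, optional_argument) for a '#!' line, else None.

    Simplified model: everything after the interpreter path is treated as
    one single argument, so we split on the first run of whitespace only.
    """
    if not first_line.startswith(b"#!"):
        return None  # not a script with an interpreter line (e.g. an ELF file)
    rest = first_line[2:].strip().decode("utf-8", errors="replace")
    if not rest:
        return None  # "#!" with nothing after it
    parts = rest.split(None, 1)
    interpreter = parts[0]
    argument = parts[1] if len(parts) > 1 else ""
    return interpreter, argument


if __name__ == "__main__":
    print(parse_shebang(b"#!/bin/sh\n"))
    print(parse_shebang(b"#!/usr/bin/env python3\n"))
    print(parse_shebang(b"\x7fELF"))
```

A file that starts with an ELF magic number has no shebang, which is why the last call returns None.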
Let's go over the vulnerability overview given by the runC team: the vulnerability allows a malicious container to (with minimal user interaction) overwrite the host runc binary and thus gain root-level code execution on the host. As opposed to Docker though, only privileged LXC containers are vulnerable. To get around the kernel refusing writes to a binary that is currently executing, we open /proc/[runc-pid]/exe for reading in our process, which creates a file descriptor at /proc/[our-pid]/fd/3. In the context of CVE-2019-5736, the non-dumpable flag denies other processes from dereferencing /proc/[pid]/exe, and therefore mitigates overwriting the runC binary through /proc/[runc-init-pid]/exe [1].

For containers running with a relatively low memory limit (e.g. 10 MB), the fix can cause processes in the container to be OOM-killed (Out Of Memory killed) by the kernel when the runC init process attaches to the container.

If the shellcode's check succeeds, it continues on; otherwise it returns and executes the regular clock_gettime.

Hence, a pod without such profiles by default gets free access to dangerous system calls that allow it to escalate privileges and gain necessary capabilities such as CAP_SYS_ADMIN for further attack. To hunt for uses of the unshare utility throughout the container environment, you can use a Falcon query that casts the widest net possible. The exploit code and proofs of concept were released on Jan. 25. You can use built-in features of Kubernetes and best practices to keep your container environment safe.

This page contains detailed information about how to use the exploit/linux/local/docker_privileged_container_escape Metasploit module. This will trigger the payload execution.
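The relationship between an open file and its /proc/[pid]/fd entry can be observed harmlessly on an ordinary temporary file. The following Python sketch assumes a Linux system with /proc mounted; the function name fd_link_target is made up for illustration.

```python
import os
import tempfile


def fd_link_target(path: str) -> str:
    """Open `path` read-only and return what /proc/self/fd/<n> points at."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Every open descriptor shows up as an entry under /proc/self/fd.
        return os.readlink(f"/proc/self/fd/{fd}")
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        real = os.path.realpath(tmp.name)
        print(real, "->", fd_link_target(real))
```

The descriptor number is whatever the kernel hands out next, so it will not necessarily be 3 as in the text.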
First, the entry point of the image is set to /proc/self/exe in order to trick runC into executing itself when the image is run. However, in general, an attempt to overwrite the runC binary will not succeed, as the kernel will not permit it to be overwritten whilst runC is executing. Listing /proc/self/exe from a shell shows the ls binary. That makes sense, as the one accessing /proc/self is the ls process that our shell spawned. The early appearance of public POCs led the runC team to release their exploit code earlier (2019-02-13) since, as they put it, the cat was out of the bag.

The attacker pod doesn't have CAP_SYS_ADMIN privileges. If CAP_SYS_ADMIN privileges are unavailable to an unprivileged user, attackers have another way to get these privileges: the unshare(CLONE_NEWNS|CLONE_NEWUSER) system call. CrowdStrike recommends upgrading the kernel to the latest version as soon as possible.

For a list of all Metasploit modules, visit the Metasploit Module Library.
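From the defender's side, whether a process actually holds CAP_SYS_ADMIN can be read from the CapEff line of /proc/[pid]/status. The sketch below is an illustrative Python helper (the function names are hypothetical); the only outside fact it relies on is that CAP_SYS_ADMIN is capability number 21 in the Linux headers.

```python
CAP_SYS_ADMIN = 21  # bit index from <linux/capability.h>


def effective_caps_from_status(status_text: str) -> int:
    """Extract the CapEff bitmask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def has_cap_sys_admin(status_text: str) -> bool:
    """True if the CAP_SYS_ADMIN bit is set in the effective capability set."""
    return bool((effective_caps_from_status(status_text) >> CAP_SYS_ADMIN) & 1)


if __name__ == "__main__":
    with open("/proc/self/status") as handle:
        print("CAP_SYS_ADMIN held:", has_cap_sys_admin(handle.read()))
```

Run inside a container, this gives a quick way to audit whether a workload was started with more privilege than intended.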
The docker run command executes runC twice. In both cases, runC is tasked with running a user-defined binary in the container. runC is normally run as root. When the runC process is executed in the container, those libraries are loaded into the runC process by the dynamic linker. Since execve doesn't affect the file descriptors open by the process, the same file descriptor trick from the previous POC can be used. For the demonstration video, I built a malicious image that overwrites the runC binary with a simple script that spawns a reverse shell at port 2345. The exploit output shows the steps taken to overwrite runC. You can find the complete POC code here. Because of a constraint involving the Docker image's entry point, this approach couldn't be used to create a malicious image that will compromise the host when run. If you are interested, an issue regarding this complication was created and contains a discussion about alternative fixes that might not introduce the same problem.

The attacker is root inside the container but is mapped to an unprivileged user on the host.

Under certain circumstances triggered by this race condition, it is possible for an attacker using this vulnerability to modify the original rather than their own copy. Boom, everything is faster. Sound useful?

But Kubernetes by default doesn't apply any seccomp or AppArmor/SELinux profile restrictions when the pod is scheduled to run. Figure caption: IOMs that CrowdStrike identified as a missing seccomp profile on a Kubernetes workload.

Check codes and messages from the module source:
CheckCode::Appears('Inside Docker container and target appears vulnerable')
return CheckCode::Safe('Not inside a Docker container') (line 73)
return Exploit::CheckCode::Safe('Exploit requires root inside container') (line 78)
return Exploit::CheckCode::Safe('Required /sys/ files for exploitation not found, possibly old version of docker or not a privileged container.') (line 84)

Also, to check the session ID, use the sessions command.
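Whether a seccomp profile is actually applied to a process can be checked from the Seccomp line of /proc/[pid]/status, where 0 means disabled, 1 strict mode, and 2 filter mode. The following Python sketch uses a hypothetical helper name and is meant as a quick audit aid, not a replacement for a proper policy scanner.

```python
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode_from_status(status_text: str) -> str:
    """Return the seccomp mode named on the 'Seccomp:' line of a status file."""
    for line in status_text.splitlines():
        # Match "Seccomp:" exactly; newer kernels also emit "Seccomp_filters:".
        if line.startswith("Seccomp:"):
            value = int(line.split(":", 1)[1].strip())
            return SECCOMP_MODES.get(value, "unknown")
    return "unknown"


if __name__ == "__main__":
    with open("/proc/self/status") as handle:
        print("seccomp mode:", seccomp_mode_from_status(handle.read()))
```

A pod scheduled without any profile would be expected to report "disabled" when this is run inside it.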
This post aims to be a comprehensive technical deep dive into the vulnerability and its various exploitation methods. The vulnerability may appear to contradict the way symbolic links are implemented in Linux. Root access in the container is required to perform this attack, as the runC binary is owned by root. The malicious image build creates a symbolic link to /proc/self/exe and sets it as the image entrypoint, then appends the run_at_link function to the libseccomp-2.3.1/src/api.c file and builds libseccomp. run_at_link calls execve to execute overwrite_runc. If you are using either runC, Docker, or LXC, don't forget to update to the patched version.

Earlier I stated that the vulnerability affects all Docker containers but only LXC's privileged containers. Docker does not enable user namespaces by default because they have some drawbacks in the context of containers, which are a bit out of the scope of this post. To mitigate the attack, LXC creates an anonymous, in-memory file using the memfd_create() system call and copies itself into the temporary in-memory file, which is then sealed to prevent further modifications.

vDSO is a performance optimization. I created a video that walks you through the exploit. Then it starts a container running a shell.

Securing containers need not be an overly complex task. This flaw is triggered by sending 4095 bytes or more of input to the legacy_parse_param function, which provides a write primitive to exploit this vulnerability. Thanks to responsible disclosure by the researchers, patches are already out on all major distros.

Source code: modules/exploits/linux/local/docker_privileged_container_escape.rb
This module may fail with error messages. Check for the possible causes from the code snippets found in the module source code. This can often help in identifying the root cause of the problem.
High-level container runtimes like Docker will normally implement functionalities such as image creation and management, and will use runC to handle tasks related to running containers: creating a container, attaching a process to an existing container (docker exec), and so on. The proc filesystem is a virtual filesystem in Linux that presents information primarily about processes, typically mounted to /proc. Let's break down LXC's description of the vulnerability: when runC attaches to a container, the attacker can trick it into executing itself. The constructor attribute (a GCC-specific syntax) indicates that the run_at_link function is to be executed as an initialization function [2] for libseccomp after the dynamic linker loads the library into the runC process. Part of the patch for an earlier flaw (CVE-2016-9962) was setting the runC init process as non-dumpable before it enters the container.

The main difference is that LXC runs unprivileged containers in a separate user namespace by default, while Docker doesn't.

The kernel is responsible for putting the time into a memory location any process can access, and through a function defined in the vDSO shared object, a process can access this information. Thanks to scumjr for the great POC exploit, LiveOverflow for the great walk-through, and of course Phil Oester for finding the vulnerability.

The kernel flaw is present in Linux 5.1-rc1 and above, where it can be exploited with a high probability of success. Though this feature is Alpha at the moment, a user can add their own seccomp or AppArmor profile and define it in SecurityContext. Using the Falcon platform, you can easily identify security issues in your environment in real time.

This module escapes from a privileged Docker container. Note that executing this exploit carries important risks regarding the Docker installation integrity on the target and inside the container.
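The per-process nature of /proc/self described above is easy to observe: each process that resolves /proc/self/exe sees its own binary. The Python sketch below assumes Linux and that a readlink utility is available on the PATH; the function names are illustrative.

```python
import os
import subprocess


def own_executable() -> str:
    """Resolve /proc/self/exe for the current (Python) process."""
    return os.readlink("/proc/self/exe")


def child_executable() -> str:
    """Ask a child `readlink` process what /proc/self/exe is from its side."""
    result = subprocess.run(
        ["readlink", "/proc/self/exe"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The parent reports the Python interpreter, the child reports readlink.
    print("parent sees:", own_executable())
    print("child sees: ", child_executable())
```

The two lines differ because "self" is always resolved relative to whichever process performs the lookup.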
Soon after the vulnerability was reported, when no POCs were publicly released yet, I attempted to develop my own POC based on the detailed description of the vulnerability given in the LXC patch addressing it. Initially, the official exploit code wasn't to be released publicly until 2019-02-18, in order to prevent malicious parties from weaponizing it before users had some time to update. The POC I wrote for this method is heavily based on q3k's POC, which, to the best of my knowledge, was the first published malicious image POC. Some other POCs you might have seen that implement a similar approach are Frichetten's and feexd's. Calling execve drops the non-dumpable flag though, and hence the new runC process's /proc/[runc-pid]/exe is accessible. This patch introduced some issues though. Here at Twistlock, our CTO John Morello wrote an excellent piece with all the relevant details and the mitigations offered by the Twistlock platform.

Also, as the temporary, in-memory LXC binary is sealed, writes to it will also fail. If you are interested, Docker and rkt (another container runtime) both list the limitations of running containers in user namespaces.

The vDSO exists because some system calls are used so frequently that invoking them the normal way can end up impacting the overall performance of the system dramatically, but it is also perfect for our nefarious needs. The current time isn't secret information, so it can easily and safely be shared across all processes.

Material posted by researchers on Twitter shows that reliable exploitation is possible on any vulnerable kernel.

The module abuses the Linux cgroup notification on release feature.
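The dumpable flag discussed above can be inspected from user space with prctl(PR_GET_DUMPABLE). The following Python sketch calls it through ctypes; it assumes Linux with a standard C library, and the function name get_dumpable is hypothetical. The constant 3 is PR_GET_DUMPABLE from the Linux prctl header.

```python
import ctypes

PR_GET_DUMPABLE = 3  # from <linux/prctl.h>


def get_dumpable() -> int:
    """Return the calling process's dumpable attribute.

    0 means not dumpable, 1 means dumpable, 2 means dumpable by root only.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    result = libc.prctl(PR_GET_DUMPABLE, 0, 0, 0, 0)
    if result < 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_GET_DUMPABLE) failed")
    return result


if __name__ == "__main__":
    print("dumpable attribute:", get_dumpable())
```

An ordinary unprivileged process normally reports 1, which is the state a freshly executed binary usually returns to.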
It is possible to substitute one of the libraries that runC loads dynamically with a malicious version that will overwrite the runC binary upon being loaded into the runC process. The step of tricking runC into executing itself is then redundant. Therefore, when the attacker tries to open the host's runC binary for writing, he is denied by the kernel.

As we saw, Kubernetes workloads running without seccomp or AppArmor/SELinux profiles are a huge risk and can result in container escape and cluster compromise.

Note: To run a local exploit, make sure you are at the msf prompt.