This is a write-up about detecting exploitation of the Log4Shell vulnerability (CVE-2021-44228) in Log4j by monitoring specific syscalls using Falco. This post also describes the analysis I employed to arrive at my conclusions.

Note that this is not meant to be an end-all detection for Log4Shell but instead one of many that, as a whole, provide coverage across different points of visibility.

UPDATE (2022-06-20): This post outlines an alternative way to detect Log4Shell, this time using Falco’s brand new byte matching feature.

Use of Weapons

Before we begin, a quick description of the main tools I used here:

  • Falco is a tool that “can detect and alert on any behavior that involves making system calls”. I think it’s one of, if not the best, open source threat detection tools one can have in Linux-land.
  • Sysdig is a syscall capture tool and shares DNA with Falco. Filters and outputs written for Sysdig are fully compatible with Falco rules. (How convenient!)

Summary

High quality detection is possible (YMMV) by watching the Java process’ write I/O buffer on network sockets, looking for patterns that indicate outbound connections via JNDI.

This has several advantages:

  • The detection is triggered on an established connection, so we can infer that the app is vulnerable, whether or not the subsequent RCE was successful.
  • By detecting the initial connection, we are not dependent on the various ways an attack may be carried out afterwards.
  • Assuming an organization’s legitimate use of Java JNDI (LDAP and/or RMI) is minimal, this detection should have a high true positive rate.

Falco rule

- macro: java_network_write
  condition: (evt.type in (write, sendto) and evt.dir=< and fd.type in (ipv4, ipv6) and proc.name=java)

- macro: jndi_ldap_indicator
  condition: (evt.buffer contains "2.16.840")

- macro: jndi_rmi_indicator
  condition: (evt.buffer startswith "JRMI")

- rule: Java Process JNDI Connection
  desc: Potential exploitation of the Log4Shell Log4j vulnerability (CVE-2021-44228)
  condition: >
    java_network_write and (jndi_ldap_indicator or jndi_rmi_indicator)    
  output: Java process JNDI connection (user=%user.name user_loginname=%user.loginname user_loginuid=%user.loginuid event=%evt.type connection=%fd.name server_ip=%fd.sip server_port=%fd.sport proto=%fd.l4proto process=%proc.name command=%proc.cmdline parent=%proc.pname buffer=%evt.buffer container_id=%container.id image=%container.image.repository)
  priority: CRITICAL
  tags: [mitre_initial_access]
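
To make the rule's logic easier to reason about outside Falco, here is a minimal Python sketch of the same conditions. The event dictionary and its keys (type, dir, fd_type, proc_name, buffer) are hypothetical stand-ins for the Falco fields used above, not a Falco API.

```python
# Hypothetical event shape: a plain dict standing in for a Falco/Sysdig event.

def java_network_write(evt: dict) -> bool:
    """Mirror of the java_network_write macro."""
    return (
        evt["type"] in ("write", "sendto")
        and evt["dir"] == "<"                      # exit event
        and evt["fd_type"] in ("ipv4", "ipv6")     # network socket
        and evt["proc_name"] == "java"
    )

def jndi_ldap_indicator(evt: dict) -> bool:
    """Mirror of 'evt.buffer contains "2.16.840"'."""
    return b"2.16.840" in evt["buffer"]

def jndi_rmi_indicator(evt: dict) -> bool:
    """Mirror of 'evt.buffer startswith "JRMI"'."""
    return evt["buffer"].startswith(b"JRMI")

def java_jndi_connection(evt: dict) -> bool:
    """Mirror of the rule condition."""
    return java_network_write(evt) and (
        jndi_ldap_indicator(evt) or jndi_rmi_indicator(evt)
    )
```

Splitting the check into small predicates follows the macro layout of the rule, so each indicator can be tuned or replaced independently.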

This rule requires running Falco with the -A flag, which turns on all syscall monitoring. To run Falco normally, remove write from the java_network_write macro:

- macro: java_network_write
  condition: (evt.type=sendto and evt.dir=< and fd.type in (ipv4, ipv6) and proc.name=java)

Quick note about byte value matching

An upcoming version of Falco (possibly 0.32.0) adds support for matching byte values expressed as hex strings (see PR). With this feature we can write more robust rules by matching on the first few bytes, instead of relying solely on ASCII patterns.
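
As a rough illustration of the idea, the Python sketch below compares a buffer against a pattern written as a hex string. The helper name is hypothetical and this is not Falco's syntax; it only shows that non-printable bytes can be matched as easily as ASCII ones once the pattern is expressed in hex.

```python
def matches_hex_at(buffer: bytes, hex_pattern: str, offset: int = 0) -> bool:
    """Return True if `buffer` holds the bytes described by `hex_pattern`
    starting at `offset` (hypothetical helper, for illustration only)."""
    pattern = bytes.fromhex(hex_pattern)
    return buffer[offset:offset + len(pattern)] == pattern

# "JRMI" written as hex; equivalent to the ASCII match used in the rule.
JRMI_HEX = "4a524d49"

if __name__ == "__main__":
    print(bytes.fromhex(JRMI_HEX) == b"JRMI")
    print(matches_hex_at(b"JRMI\x00", JRMI_HEX))
    # Non-printable bytes cannot be written as an ASCII pattern,
    # but are straightforward as hex.
    print(matches_hex_at(b"\x00\xff\x10", "00ff"))
```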

Limitations

  • This detection only covers LDAP and RMI connections.
  • Java 17 uses the write() syscall, but Falco ignores it by default. There is an option that enables monitoring of all syscalls (-A flag), but it could impact performance.
  • Because Falco captures only the first 80 bytes of the I/O buffer by default, this detection can be bypassed if the path of the callback URL exceeds 27 bytes. See the “Effect of callback URL path length” section for details and workarounds.
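
The 27-byte figure can be turned into simple arithmetic. The Python sketch below assumes the bytes surrounding the path have a fixed size, and derives that size from the numbers above (80-byte capture, 27-byte limit, 8-byte "2.16.840" pattern) rather than from a packet capture. The function names and the 45-byte constant are therefore assumptions, not measured values.

```python
DEFAULT_CAPTURE_SIZE = 80        # bytes of I/O buffer captured by default
PATTERN = b"2.16.840"            # pattern matched by the rule
MAX_PATH_AT_DEFAULT = 27         # limit stated in the limitations above

# Derived assumption: everything in the buffer that is neither the path
# nor the pattern takes a constant number of bytes.
FIXED_OVERHEAD = DEFAULT_CAPTURE_SIZE - MAX_PATH_AT_DEFAULT - len(PATTERN)

def pattern_visible(path_len: int,
                    capture_size: int = DEFAULT_CAPTURE_SIZE,
                    pattern: bytes = PATTERN) -> bool:
    """True if the whole pattern still fits inside the captured bytes."""
    return FIXED_OVERHEAD + path_len + len(pattern) <= capture_size

def max_detectable_path_len(capture_size: int,
                            pattern: bytes = PATTERN) -> int:
    """Longest path for which the pattern is still fully captured."""
    return max(capture_size - FIXED_OVERHEAD - len(pattern), 0)

if __name__ == "__main__":
    for size in (80, 240):
        print(size, max_detectable_path_len(size))
```

Under the same assumption, matching a longer pattern (for example the full OID) shrinks the allowed path length by the extra pattern bytes.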

Analysis

Environment

Tools

  • Virtualbox VM with Ubuntu 18.04.6 LTS
  • Sysdig 0.27.1
  • Falco 0.30.0

Tested software

  • Java versions
    • OpenJDK Runtime Environment (build 17.0.1+12-Ubuntu-118.04)
    • OpenJDK Runtime Environment (build 11.0.13+8-Ubuntu-0ubuntu1.18.04)
    • OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1~18.04-b07)
    • Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
    • Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
  • Vulnerable Log4j versions:
    • All versions from 2.0 up to 2.14.1

PoCs

All tested exploits operate as illustrated below (credits to Fastly):

The vulnerable apps ran exclusively as a separate user (“poc”) to make it easier to trace processes.

Discovery

I started with the vulnerable jar file and exploit server from d0nutptr’s PoC (log4j v2.14.1 running on Java 17). The exploit server is run as follows:

Note that the jndi:ldap server is listening on port 12345, and the HTTP server (with the malicious Java class) is on port 12346. It is also good to note the code for the malicious Java class (“foo”):

The vulnerable app (jar file) is then run from a script:

The string “Pwned 8)” is printed which indicates that the exploit was successful.

While this was happening, Sysdig was running in another terminal window. To make sure I didn’t miss anything, I first cast a wide net, i.e., I ran it such that it logged most syscalls from the vulnerable app:

Notable events

Looking at the resultant logs, four events stood out:

Event 1

The initial connection to the callback URL jndi:ldap://localhost:12345 (Phase 1 in the diagram), and the first few bytes that were sent by the app:

The strings foo (path in callback URL), objectClass0 and 2.16.840.1.113730.3.4.2 look interesting… but we’ll get back to that later.

Event 2

The subsequent HTTP call to the Python web server that was hosting the malicious class (Phase 2 in the diagram):

This is showing the actual HTTP GET request made by the app to the web server.

Event 3

The payload execution which wrote “Pwned 8)” to STDOUT:

Event 4

A final call was made to the original jndi:ldap callback URL (port 12345) which also includes the 2.16.840.1.113730.3.4.2 string seen during the initial connection. This was immediately followed by connection termination:

It is also important to note that this event only fires after the malicious class has been loaded and executed. If the class establishes a reverse shell, for example, then this event will not show up until after the reverse shell session has terminated.

Events analysis

This compares detection potential for the notable events from the previous section.

| Event | Possible signal for detection | Potential advantages | Potential disadvantages |
|-------|-------------------------------|----------------------|-------------------------|
| 1 | write() data | Low FP due to specific pattern match; great coverage because initial connection | Matched pattern can be bypassed with long enough URL path unless I/O buffer is increased (see later section) |
| 2 | write() data | Some visibility over the HTTP request | High FP because Java commonly makes HTTP connections |
| 3 | Depends on exploit | Post-exploit-specific detection | Could be very hard to generalize; only good for catching known TTPs |
| 4 | write() data | Low FP due to specific pattern match; good coverage because final connection; good backup to Event 1 due to similar pattern | Not as reliable as Event 1 because there is a chance this doesn’t trigger (e.g. the running RCE process blocks it) |

Given this, it made the most sense to obtain signals from Event 1, with Event 4 serving as fallback.

The most likely candidate for a pattern match is the string 2.16.840.1.113730.3.4.2, in full or in part. Supporting this choice, that number turns out to be the LDAP OID for the ManageDsaIT Request Control.
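
The OID shows up as readable text because LDAP encodes a control's type (LDAPOID) as an OCTET STRING holding the dotted-decimal ASCII form. A minimal Python sketch of that BER encoding follows. The helper names are hypothetical, and the optional criticality and controlValue fields are left out, so this is not a byte-for-byte reproduction of what Java sends.

```python
MANAGE_DSA_IT_OID = "2.16.840.1.113730.3.4.2"

def ber_tlv(tag: int, value: bytes) -> bytes:
    """Encode a BER tag-length-value using the short length form only."""
    if len(value) >= 0x80:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([tag, len(value)]) + value

def encode_control(oid: str) -> bytes:
    """Control ::= SEQUENCE { controlType LDAPOID, ... } with only the type."""
    control_type = ber_tlv(0x04, oid.encode("ascii"))   # OCTET STRING
    return ber_tlv(0x30, control_type)                  # SEQUENCE

def encode_controls(oids: list) -> bytes:
    """controls [0] Controls: context-specific, constructed tag 0xA0."""
    return ber_tlv(0xA0, b"".join(encode_control(o) for o in oids))

if __name__ == "__main__":
    encoded = encode_controls([MANAGE_DSA_IT_OID])
    print(encoded.hex())
    print(b"2.16.840" in encoded)
```

Because the OID travels as ASCII, a plain substring match on the buffer is enough to spot it.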

Updated Sysdig query

Based on the prior findings, I updated the Sysdig command so that it captures only the relevant information:

sysdig -X "user.name=poc and evt.type in (write, sendto) and evt.dir=< and \
    fd.type in (ipv4, ipv6) and proc.name=java" \
    -p"%fd.name (%evt.type, %evt.buflen bytes) %evt.buffer"

Now is a good time to explain the syntax:

| Statement | Description |
|-----------|-------------|
| user.name=poc | (Filter) Events from the “poc” user |
| evt.type in (write, sendto) | (Filter) write() or sendto() syscalls |
| evt.dir=< | (Filter) Exit events |
| fd.type in (ipv4, ipv6) | (Filter) Network socket file descriptor type |
| proc.name=java | (Filter) Java process |
| %fd.name | (Output) File descriptor name. For a network socket, this contains the client IP and port (fd.cip, fd.cport) and server IP and port (fd.sip, fd.sport) |
| %evt.buffer | (Output) Data being written to the destination. Note that this only shows the first 80 bytes by default. |
| %evt.buflen | (Output) Actual length of the data buffer in bytes |

For a complete list of fields, see this doc from falco.org.
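
When post-processing this output, it can help to split %fd.name back into its parts. The Python sketch below assumes the socket name has the form client_ip:client_port->server_ip:server_port. That format and the names used here are assumptions for illustration; other socket types may be rendered differently.

```python
from typing import NamedTuple

class SocketTuple(NamedTuple):
    client_ip: str
    client_port: int
    server_ip: str
    server_port: int

def parse_fd_name(fd_name: str) -> SocketTuple:
    """Split 'cip:cport->sip:sport' into its four fields.

    rsplit on the last ':' keeps addresses that themselves contain
    colons (e.g. IPv6) intact.
    """
    client, server = fd_name.split("->", 1)
    client_ip, client_port = client.rsplit(":", 1)
    server_ip, server_port = server.rsplit(":", 1)
    return SocketTuple(client_ip, int(client_port), server_ip, int(server_port))

if __name__ == "__main__":
    # Made-up example value; the client port is arbitrary.
    print(parse_fd_name("127.0.0.1:50000->127.0.0.1:12345"))
```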

Notice that the sendto() event was included here, too. This is explained in the next section.

Java 17 vs older versions

During the course of my testing, I learned that different Java versions use different syscalls when making JNDI connections. While Java 17 (the latest, default version in Ubuntu 18.04) makes write() calls, Java 6, 7, 8 and 11 all use sendto().

Java 17 looks like this:

while on Java 6, 7, 8 and 11:

The data is identical except for the syscall itself.

Effect of callback URL path length

The data we’ve seen so far was from using the callback URL localhost:12345/foo, and we can see the path in the Event 1 signal:

By default, Sysdig (and Falco) captures only the first 80 bytes of the I/O buffer. If the path is longer, we can see the subsequent bytes of data getting shifted back such that some end up beyond the capture boundary. For example, if the path is /foo456789/123456/8901234567, we can’t see the full 2.16.840.1.113730.3.4.2 pattern anymore:

In the case of the base64-encoded URLs such as described in this article, a path like /Basic/Command/Base64/d2dldCBoeHhwOi8vMTI3LjAuMC4xL2xoLnNoO2NobW9kICt4IGxoLnNoOy4vbGguc2gK would be long enough to completely fill up the buffer:

At this point we would have lost whatever pattern we were matching for, and the detection is effectively bypassed.

The only way to work around this is to start Sysdig and Falco with a larger I/O buffer capture size. Here’s the data when the size is 240 bytes (3x the default):

We get our signal back!

Increasing the buffer size could potentially impact the performance of systems. If we need to stick to defaults, there is fortunately a fallback even if long paths bypass the Event 1 signal: the Event 4 signal.

As far as I can tell, this value is consistent and is independent of any property of the callback URL. However, it does come with its own caveats as described in the Event 4 section.

Also note that the upcoming byte matching feature of Falco could potentially eliminate this issue.

What about RMI?

Up to this point I have only covered jndi:ldap connections. I did the same analysis for jndi:rmi, and it turns out that the matched pattern is straightforward and consistent:

So, for RMI, I think it should be enough to simply watch for the pattern “JRMI”.
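
For context, “JRMI” is the magic value that opens an RMI transport stream. The Python sketch below reads the header as I understand it from the Java RMI wire protocol specification: the 4-byte magic, a 2-byte version, and a 1-byte protocol identifier. The function name is hypothetical, and the sample bytes in the usage example are synthetic rather than taken from the captures in this post.

```python
import struct

RMI_MAGIC = b"JRMI"

# Protocol identifiers as listed in the RMI wire protocol specification.
PROTOCOLS = {
    0x4B: "StreamProtocol",
    0x4C: "SingleOpProtocol",
    0x4D: "MultiplexProtocol",
}

def parse_rmi_header(buffer: bytes):
    """Return (version, protocol_name) if the buffer starts with an RMI
    header, otherwise None."""
    if len(buffer) < 7 or not buffer.startswith(RMI_MAGIC):
        return None
    # Big-endian: unsigned 16-bit version, then unsigned 8-bit protocol.
    version, protocol = struct.unpack_from(">HB", buffer, 4)
    return version, PROTOCOLS.get(protocol, "unknown")

if __name__ == "__main__":
    # Synthetic header for illustration only.
    print(parse_rmi_header(b"JRMI\x00\x02\x4b"))
    print(parse_rmi_header(b"GET / HTTP/1.1\r\n"))
```

Since the magic sits at the very start of the stream, a startswith match is not affected by the capture size limit discussed earlier.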

Testing all affected Log4j versions

The last thing I wanted to do was to verify that the signals are consistent across different vulnerable versions of Log4j and Java. Thanks to the log4jtest PoC I was able to write a quick and dirty script that can iterate through all combinations of Java and Log4j versions.

I installed Java 8, 11 and 17 as OpenJDK from the Ubuntu repository, while Java 6 and 7 were downloaded directly from Oracle. All Log4j libraries were downloaded from Apache. I then compiled a separate build of the app for each Java version.
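
The iteration can be sketched roughly as follows. This is not the actual script: the version lists and the incompatible/backport boundaries are copied from the results reported in this section, and classify is a hypothetical helper that only decides which combinations are worth running.

```python
from itertools import product

JAVA_VERSIONS = [17, 11, 8, 7, 6]
LOG4J_VERSIONS = [
    "2.0", "2.0.1", "2.0.2", "2.1", "2.2", "2.3", "2.3.1",
    "2.4", "2.4.1", "2.5", "2.6", "2.6.1", "2.6.2", "2.7",
    "2.8", "2.8.1", "2.8.2", "2.9.0", "2.9.1", "2.10.0",
    "2.11.0", "2.11.1", "2.11.2", "2.12.0", "2.12.1", "2.12.2",
    "2.12.3", "2.13.0", "2.13.1", "2.13.2", "2.13.3", "2.14.0", "2.14.1",
]
# Backport releases (not vulnerable), per the results table.
BACKPORTS = {"2.3.1", "2.12.2", "2.12.3"}

def version_key(version: str) -> tuple:
    """Turn '2.12.1' into (2, 12, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def classify(java: int, log4j: str) -> str:
    """Return 'incompatible', 'backport', or 'test' for one combination."""
    key = version_key(log4j)
    if java == 6 and key >= (2, 4):
        return "incompatible"
    if java == 7 and key >= (2, 13):
        return "incompatible"
    if log4j in BACKPORTS:
        return "backport"
    return "test"

if __name__ == "__main__":
    for java, log4j in product(JAVA_VERSIONS, LOG4J_VERSIONS):
        if classify(java, log4j) == "test":
            # Placeholder: build and run the vulnerable app here.
            print(f"would test Log4j {log4j} on Java {java}")
```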

Test results below.

  • expected: the expected pattern was observed
  • incompatible: Log4j version is not compatible with the Java version
  • backport: Log4j version is a backport update for its major version (i.e. not vulnerable)

| Log4j version | Java 17 (write) | Java 11 (sendto) | Java 8 (sendto) | Java 7 (sendto) | Java 6 (sendto) |
|---------------|-----------------|------------------|-----------------|-----------------|-----------------|
| 2.0 | expected | expected | expected | expected | expected |
| 2.0.1 | expected | expected | expected | expected | expected |
| 2.0.2 | expected | expected | expected | expected | expected |
| 2.1 | expected | expected | expected | expected | expected |
| 2.2 | expected | expected | expected | expected | expected |
| 2.3 | expected | expected | expected | expected | expected |
| 2.3.1 | backport | backport | backport | backport | backport |
| 2.4 | expected | expected | expected | expected | incompatible |
| 2.4.1 | expected | expected | expected | expected | incompatible |
| 2.5 | expected | expected | expected | expected | incompatible |
| 2.6 | expected | expected | expected | expected | incompatible |
| 2.6.1 | expected | expected | expected | expected | incompatible |
| 2.6.2 | expected | expected | expected | expected | incompatible |
| 2.7 | expected | expected | expected | expected | incompatible |
| 2.8 | expected | expected | expected | expected | incompatible |
| 2.8.1 | expected | expected | expected | expected | incompatible |
| 2.8.2 | expected | expected | expected | expected | incompatible |
| 2.9.0 | expected | expected | expected | expected | incompatible |
| 2.9.1 | expected | expected | expected | expected | incompatible |
| 2.10.0 | expected | expected | expected | expected | incompatible |
| 2.11.0 | expected | expected | expected | expected | incompatible |
| 2.11.1 | expected | expected | expected | expected | incompatible |
| 2.11.2 | expected | expected | expected | expected | incompatible |
| 2.12.0 | expected | expected | expected | expected | incompatible |
| 2.12.1 | expected | expected | expected | expected | incompatible |
| 2.12.2 | backport | backport | backport | backport | incompatible |
| 2.12.3 | backport | backport | backport | backport | incompatible |
| 2.13.0 | expected | expected | expected | incompatible | incompatible |
| 2.13.1 | expected | expected | expected | incompatible | incompatible |
| 2.13.2 | expected | expected | expected | incompatible | incompatible |
| 2.13.3 | expected | expected | expected | incompatible | incompatible |
| 2.14.0 | expected | expected | expected | incompatible | incompatible |
| 2.14.1 | expected | expected | expected | incompatible | incompatible |

The results show that the pattern is consistent across the board. At this point I was confident enough to develop the appropriate Falco rule.

Acknowledgements

Big thanks to Andrew MacPherson and Nathanial Lattimer for providing me with PoCs and guidance. Also to Ian Carroll for feedback and ideas.