eBPF for Cybersecurity - Part 1
What is eBPF?
eBPF was born out of a need for a better Linux tracing tool. It was first released in a limited capacity in 2014 with Linux 3.18; making full use of eBPF requires Linux 4.4 or later.
eBPF can run sandboxed programs in the Linux kernel without changing kernel source code or loading kernel modules.
eBPF is a mechanism for Linux applications to execute code in Linux kernel space. It has been used to create programs for networking, debugging, tracing, firewalls, and more.
To understand this in more detail, start with Linux and how it divides its memory into two areas.
Kernel space - put simply, kernel space is where the core of the operating system resides. Because of the kernel's privileged nature, code here has unrestricted access to all hardware: memory, storage, CPU, and so on.
Kernel space is protected and allowed to run only trusted code, namely kernel code and device drivers.
User space - user space is where anything that is not a kernel process runs, e.g., regular applications. User-space code has limited access to hardware and relies on code running in kernel space for privileged operations such as disk, network, or any other I/O. This happens via a kernel API referred to as "system calls".
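As a small illustration of that boundary, the sketch below has a user-space Python process ask the kernel to do I/O on its behalf. It assumes a Unix-like system; the message text is made up for the example. Each `os` call here is a thin wrapper over a system call.

```python
import os

# os.pipe() asks the kernel to create a pipe and returns two file
# descriptors that the kernel manages on behalf of this process.
read_fd, write_fd = os.pipe()

message = b"hello from user space\n"

# os.write() wraps the write system call: the process hands the bytes to
# the kernel, which performs the I/O and reports how many bytes it accepted.
bytes_written = os.write(write_fd, message)

# os.read() wraps the read system call: the kernel copies the data back
# into this process's memory.
received = os.read(read_fd, len(message))

os.close(read_fd)
os.close(write_fd)

print(f"kernel accepted {bytes_written} bytes; read back {received!r}")
```

The process never touches the pipe buffer directly; it only sees file descriptors and the return values of the calls.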
While the system call interface is sufficient in most cases, developers sometimes need to add support for new hardware, implement new filesystems, or even add custom system calls. To make this possible, programmers need a way to extend the base kernel without adding directly to the kernel source code. Linux Kernel Modules (LKMs) serve this function.
LKMs (Linux Kernel Modules) are loaded directly into the kernel. They can be loaded at runtime, removing the need to recompile the entire kernel and reboot the machine each time a new kernel module is required.
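On a running Linux system, the modules currently loaded are listed in `/proc/modules`. The sketch below parses that format; the helper names (`parse_proc_modules`, `read_loaded_modules`) and the sample lines are illustrative assumptions, not output from a real machine.

```python
from pathlib import Path
from typing import List, NamedTuple, Optional


class LoadedModule(NamedTuple):
    name: str
    size_bytes: int
    ref_count: Optional[int]  # None when the kernel prints "-" instead of a count


def parse_proc_modules(text: str) -> List[LoadedModule]:
    """Parse text in the /proc/modules format into LoadedModule records."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        ref = int(fields[2]) if fields[2].isdigit() else None
        modules.append(LoadedModule(fields[0], int(fields[1]), ref))
    return modules


def read_loaded_modules(path: str = "/proc/modules") -> List[LoadedModule]:
    """Read the live module list, or return [] where the file does not exist."""
    proc_file = Path(path)
    if not proc_file.exists():
        return []
    return parse_proc_modules(proc_file.read_text())


# Made-up sample in the same column layout: name, size, reference count, ...
SAMPLE = """\
nf_tables 303104 0 - Live 0x0000000000000000
ext4 983040 1 - Live 0x0000000000000000
"""

for module in parse_proc_modules(SAMPLE):
    print(f"{module.name}: {module.size_bytes} bytes, refs={module.ref_count}")
```

Calling `read_loaded_modules()` on a Linux host returns the live list instead of the sample.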
LKMs are helpful but introduce risk to the system. The separation between kernel and user space adds several important security measures to the OS: kernel services connect user space to the physical hardware. Because LKMs run inside the kernel, they bypass that separation.
LKMs can crash the kernel, and a kernel version upgrade can widen the radius of security vulnerabilities. They are hard for maintainers, too!
What does eBPF do?
eBPF (Extended Berkeley Packet Filter) is a technology that makes it possible to run special programs deep inside the Linux operating system in an isolated way.
Because it filters data packets from the network and is embedded in the kernel, BPF provides a network interface with a security layer that ensures the packet data is reliable and accessible. Using this approach, teams can more easily collect the most important observability data from Linux applications and network resources.
Developed out of a need for improved Linux tracing tools, eBPF was influenced by DTrace, a tracing tool available mainly for BSD and Solaris systems. Unlike DTrace, Linux tracing was not able to give a global overview of a running system; rather, it was restricted to specific frameworks for system calls, library calls, and functions.
Before being loaded into the kernel, the eBPF program needs to pass a particular series of requirements. Verification includes simulating the execution of the eBPF program in the kernel's eBPF virtual machine.
With more than 10,000 lines of code to carry out its set of checks, the verifier goes over the potential paths the eBPF program might take when executed in the kernel, to ensure the program runs to completion without looping indefinitely, which would result in a kernel lockup.
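One of those checks, rejecting programs that can jump backwards into code they are still executing, can be sketched as a depth-first search over the program's control-flow graph. This is a toy model under stated assumptions: the graph encoding (instruction index mapped to the indices it can jump to) and the function name `has_back_edge` are invented for illustration, and the real verifier does far more, such as tracking register state and checking memory bounds.

```python
from typing import Dict, List

UNVISITED, ON_PATH, DONE = 0, 1, 2


def has_back_edge(cfg: Dict[int, List[int]], entry: int = 0) -> bool:
    """Return True if some path from `entry` can revisit an instruction."""
    state: Dict[int, int] = {}

    def visit(node: int) -> bool:
        state[node] = ON_PATH
        for successor in cfg.get(node, []):
            successor_state = state.get(successor, UNVISITED)
            if successor_state == ON_PATH:
                return True  # jump back into the path being explored: a loop
            if successor_state == UNVISITED and visit(successor):
                return True
        state[node] = DONE
        return False

    return visit(entry)


# A branch at instruction 1 that rejoins at 3, then exits.
branching = {0: [1], 1: [2, 3], 2: [3], 3: []}
# Instruction 2 can jump back to instruction 1.
looping = {0: [1], 1: [2], 2: [1, 3], 3: []}

print("branching program has loop:", has_back_edge(branching))
print("looping program has loop:", has_back_edge(looping))
```

Newer kernels also accept loops that the verifier can prove are bounded, which this sketch does not model.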
If all checks pass, the eBPF program is loaded and compiled into the kernel at a location in a code path, where it waits for the appropriate signal. When that signal arrives in the form of an event, the eBPF program attached to the code path is triggered. Once initiated, the bytecode executes and collects information.
This way, eBPF allows programmers to execute bytecode safely within the Linux kernel without adding or changing kernel source code. It can't replace LKMs altogether, but an eBPF program can introduce custom code that interacts with protected hardware resources with limited threat to the kernel.
eBPF programs are event-driven and are run when the kernel or an application passes a certain hook point. Pre-defined hooks include system calls, function entry/exit, kernel tracepoints, network events, and several others.
eBPF includes the following elements:
Predefined hooks - eBPF programs are event-driven and run when execution passes through a hook. Hooks are predefined and can include events like network events, system calls, function entry and exit, and kernel tracepoints. If there is no predefined hook for a certain requirement, you can create a user or kernel probe (uprobe or kprobe).
Program verification - The bpf system call is used to load an eBPF program into the Linux kernel, usually through an eBPF library. When the program is loaded into the kernel, it has to be verified to ensure it is safe to run:
The program can only be loaded by a privileged process.
The program won't crash or damage the system.
The program will not loop forever; it always runs to completion.
eBPF maps - eBPF programs must be able to store their state and share collected data. eBPF maps serve this purpose: they can be accessed from eBPF programs and, via a system call, from user-space applications.
Helper calls - eBPF programs need to stay compatible across kernel versions and avoid being bound to a specific Linux kernel. Helper functions are an API provided by the kernel for this purpose. Helper calls allow programs to generate random numbers, get the current time and date, access eBPF maps, manipulate forwarding logic and network packets, and more.
Function and tail calls - Function calls allow functions to be defined and called within a program. Tail calls enable the execution of other eBPF programs.
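How hooks, programs, and maps fit together can be sketched with a user-space toy model. Nothing below talks to the real kernel: `ToyKernel`, `count_execs`, and `exec_counts` are hypothetical names, and a plain dict stands in for an eBPF hash map. The hook name is borrowed from the `sys_enter_execve` tracepoint.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

EventContext = Dict[str, int]
Program = Callable[[EventContext], None]


class ToyKernel:
    """Hypothetical stand-in for the kernel's hook dispatch; not a real API."""

    def __init__(self) -> None:
        self.hooks: DefaultDict[str, List[Program]] = defaultdict(list)

    def attach(self, hook: str, program: Program) -> None:
        self.hooks[hook].append(program)

    def fire(self, hook: str, ctx: EventContext) -> None:
        # Every program attached to the hook runs once per event.
        for program in self.hooks[hook]:
            program(ctx)


# Shared state, playing the role of an eBPF hash map keyed by process ID.
exec_counts: Dict[int, int] = {}


def count_execs(ctx: EventContext) -> None:
    """Runs on each simulated execve event and bumps a per-PID counter."""
    pid = ctx["pid"]
    exec_counts[pid] = exec_counts.get(pid, 0) + 1


kernel = ToyKernel()
kernel.attach("sys_enter_execve", count_execs)

# Simulated events: PID 101 calls execve twice, PID 202 once.
for pid in (101, 101, 202):
    kernel.fire("sys_enter_execve", {"pid": pid})

# The "user-space" side reads the collected data out of the map.
for pid, count in sorted(exec_counts.items()):
    print(f"pid {pid}: {count} execve call(s)")
```

In a real deployment the counting function would be verified bytecode running in the kernel, and the reader would be a separate user-space process fetching the map contents through the bpf system call.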
eBPF programs can be utilized for efficient networking, tracing and data profiling, observability, and security tooling, e.g., for threat defense and intrusion detection.
Why is this technology useful for security?
It extends visibility and control to all system calls and provides packet-level visibility of all network traffic in a single system, without the performance implications of traditional security agents.
This makes the following security use cases easier to achieve than in the past:
1. Reduce alert fatigue - with additional insight from eBPF, teams can reduce alert fatigue by 97% with proper context, i.e., security observability
2. Security protection at point of attack
Why would a vendor implement eBPF?
1. Decouple security innovation from the OS, while still allowing the same deep insight as in-kernel tech
2. More system throughput
3. Consolidate system call, network filtering & process context into a single system
4. Limit overhead for observability
The fact that we can inspect packets gives us extremely performant observability tools that can be mapped to other aspects such as Kubernetes metadata and get in-depth security forensics from the extracted information. We can use the ability to drop or modify packets for network policies and do encryption with eBPF for security. Also, since we can send packets and change the destination for a packet, eBPF allows us to create powerful network functionalities, such as load balancing, routing and service mesh.
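The drop-or-pass decision described above can be sketched in user space. The example below only imitates the logic an XDP-style eBPF program might apply to an IPv4 header. The verdict values mirror the kernel's XDP action codes, while the blocklist, addresses (taken from documentation ranges), and function names are assumptions made for the example.

```python
import ipaddress
import struct

# Verdict values mirror the XDP action codes used by the kernel.
XDP_DROP = 1
XDP_PASS = 2

# Hypothetical blocklist for the demo.
BLOCKED_SOURCES = {ipaddress.ip_address("203.0.113.7")}


def build_ipv4_header(src: str, dst: str) -> bytes:
    """Build a minimal 20-byte IPv4 header (checksum left at zero) for the demo."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,  # version 4, header length of 5 32-bit words
        0,             # DSCP/ECN
        20,            # total length: header only
        0,             # identification
        0,             # flags and fragment offset
        64,            # time to live
        6,             # protocol number for TCP
        0,             # checksum, not computed in this sketch
        ipaddress.ip_address(src).packed,
        ipaddress.ip_address(dst).packed,
    )


def filter_packet(header: bytes) -> int:
    """Return XDP_DROP for blocklisted sources, XDP_PASS otherwise."""
    if len(header) < 20:
        return XDP_DROP  # too short to hold an IPv4 header
    source = ipaddress.ip_address(header[12:16])  # source address: bytes 12..15
    if source in BLOCKED_SOURCES:
        return XDP_DROP
    return XDP_PASS


blocked = build_ipv4_header("203.0.113.7", "192.0.2.1")
allowed = build_ipv4_header("198.51.100.9", "192.0.2.1")
print("blocked source verdict:", filter_packet(blocked))
print("allowed source verdict:", filter_packet(allowed))
```

The explicit length check before reading the address mirrors the bounds checks that the verifier requires of a real program before it touches packet data.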