docs: Fix helpers and maps guide

Pragyansh Chaturvedi
2026-01-29 02:54:46 +05:30
parent 581269e52b
commit aded125cba
2 changed files with 58 additions and 130 deletions


@ -2,6 +2,11 @@
PythonBPF provides helper functions and utilities for BPF programs and userspace code. PythonBPF provides helper functions and utilities for BPF programs and userspace code.
```{note}
**Work in Progress:** PythonBPF is under active development. We are constantly adding support for more helpers, kfuncs, and map types. Check back for updates!
```
For comprehensive documentation on BPF helpers, see the [eBPF Helper Functions documentation on ebpf.io](https://ebpf.io/what-is-ebpf/#helper-calls).
## BPF Helper Functions ## BPF Helper Functions
BPF helper functions are kernel-provided functions that BPF programs can call to interact with the system. PythonBPF exposes these through the `pythonbpf.helper` module. BPF helper functions are kernel-provided functions that BPF programs can call to interact with the system. PythonBPF exposes these through the `pythonbpf.helper` module.
@ -16,6 +21,8 @@ from pythonbpf.helper import pid, ktime, comm
Get the current process ID. Get the current process ID.
> **Linux Kernel Helper:** `bpf_get_current_pid_tgid()`
```python ```python
from pythonbpf.helper import pid from pythonbpf.helper import pid
@ -24,39 +31,28 @@ from pythonbpf.helper import pid
def trace_open(ctx: c_void_p) -> c_int64: def trace_open(ctx: c_void_p) -> c_int64:
process_id = pid() process_id = pid()
print(f"Process {process_id} opened a file") print(f"Process {process_id} opened a file")
return c_int64(0) return 0
``` ```
**Returns:** `c_int32` - The process ID of the current task **Returns:** `c_int32` - The process ID of the current task
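The underlying kernel helper, `bpf_get_current_pid_tgid()`, returns one 64-bit value with the thread group ID in the upper 32 bits and the kernel-level pid (the thread ID as seen from userspace) in the lower 32 bits. The sketch below is plain userspace Python that shows the unpacking; `split_pid_tgid` and the sample value are hypothetical and not part of PythonBPF, and this guide does not state which half `pid()` returns.

```python
# Illustrative only: unpack the 64-bit value returned by bpf_get_current_pid_tgid().
# split_pid_tgid is a hypothetical name, not a PythonBPF function.
MASK_32 = 0xFFFFFFFF


def split_pid_tgid(pid_tgid: int) -> tuple:
    """Return (tgid, pid) from the packed 64-bit helper result."""
    tgid = (pid_tgid >> 32) & MASK_32  # upper 32 bits: thread group ID
    pid = pid_tgid & MASK_32           # lower 32 bits: kernel pid (thread ID)
    return tgid, pid


packed = (4242 << 32) | 4250  # made-up value for demonstration
tgid, tid = split_pid_tgid(packed)
print(f"tgid={tgid} tid={tid}")
```

For a single-threaded process the two halves are equal, so the distinction only matters when tracing multi-threaded programs.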
#### comm() #### comm()
Get the current process command name (up to 16 characters). Get the current process command name.
```python > **Linux Kernel Helper:** `bpf_get_current_comm()`
from pythonbpf.helper import comm
@bpf
@section("tracepoint/syscalls/sys_enter_execve")
def trace_exec(ctx: c_void_p) -> c_int64:
# comm requires a buffer to fill
process_name = str(16)
comm(process_name)
print(f"Executing: {process_name}")
return c_int64(0)
```
**Parameters:** **Parameters:**
* `buf` - Buffer to fill with the process command name * `buf` - Buffer to fill with the process command name
**Returns:** `c_int64` - 0 on success, negative on error **Returns:** `c_int64` - 0 on success, negative on error
**Note:** The buffer should be at least 16 bytes (TASK_COMM_LEN) to hold the full command name.
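`TASK_COMM_LEN` is 16 bytes including the terminating NUL, so a command name carries at most 15 visible bytes. If a raw comm buffer reaches userspace (for example through a map or an event struct), it can be decoded roughly as follows. This is a minimal sketch in plain Python; `decode_comm` is a hypothetical helper, not a PythonBPF API.

```python
# Illustrative only: decode a raw 16-byte comm buffer in userspace.
# decode_comm is a hypothetical helper, not part of PythonBPF.
TASK_COMM_LEN = 16  # kernel constant: 15 name bytes plus a terminating NUL


def decode_comm(raw: bytes) -> str:
    """Keep the bytes before the first NUL and decode them as text."""
    name = raw[:TASK_COMM_LEN].split(b"\x00", 1)[0]
    return name.decode("utf-8", errors="replace")


buf = b"python3".ljust(TASK_COMM_LEN, b"\x00")  # simulated buffer contents
print(decode_comm(buf))
```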
#### uid() #### uid()
Get the current user ID. Get the current user ID.
> **Linux Kernel Helper:** `bpf_get_current_uid_gid()`
```python ```python
from pythonbpf.helper import uid from pythonbpf.helper import uid
@ -66,7 +62,7 @@ def trace_open(ctx: c_void_p) -> c_int64:
user_id = uid() user_id = uid()
if user_id == 0: if user_id == 0:
print("Root user opened a file") print("Root user opened a file")
return c_int64(0) return 0
``` ```
**Returns:** `c_int32` - The user ID of the current task **Returns:** `c_int32` - The user ID of the current task
@ -77,6 +73,8 @@ def trace_open(ctx: c_void_p) -> c_int64:
Get the current kernel time in nanoseconds since system boot. Get the current kernel time in nanoseconds since system boot.
> **Linux Kernel Helper:** `bpf_ktime_get_ns()`
```python ```python
from pythonbpf.helper import ktime from pythonbpf.helper import ktime
@ -85,7 +83,7 @@ from pythonbpf.helper import ktime
def measure_latency(ctx: c_void_p) -> c_int64: def measure_latency(ctx: c_void_p) -> c_int64:
start_time = ktime() start_time = ktime()
# Store for later comparison # Store for later comparison
return c_int64(0) return 0
``` ```
**Returns:** `c_int64` - Current time in nanoseconds **Returns:** `c_int64` - Current time in nanoseconds
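Raw nanosecond differences are hard to read once they reach userspace. The sketch below formats a nanosecond delta into a human-friendly unit; `format_ns` is a hypothetical helper, and `time.monotonic_ns()` is used only as a userspace stand-in for `ktime()` so the example runs without loading a BPF program.

```python
import time


def format_ns(delta_ns: int) -> str:
    """Render a nanosecond duration using the largest unit that fits."""
    if delta_ns < 1_000:
        return f"{delta_ns} ns"
    if delta_ns < 1_000_000:
        return f"{delta_ns / 1_000:.3f} us"
    if delta_ns < 1_000_000_000:
        return f"{delta_ns / 1_000_000:.3f} ms"
    return f"{delta_ns / 1_000_000_000:.3f} s"


start = time.monotonic_ns()            # stand-in for ktime() at entry
elapsed = time.monotonic_ns() - start  # stand-in for ktime() - start at exit
print(format_ns(elapsed))
```

Only differences between two timestamps are meaningful here; the absolute value counts from boot, not from the Unix epoch.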
@ -102,6 +100,8 @@ def measure_latency(ctx: c_void_p) -> c_int64:
Get the ID of the CPU on which the BPF program is running. Get the ID of the CPU on which the BPF program is running.
> **Linux Kernel Helper:** `bpf_get_smp_processor_id()`
```python ```python
from pythonbpf.helper import smp_processor_id from pythonbpf.helper import smp_processor_id
@ -110,7 +110,7 @@ from pythonbpf.helper import smp_processor_id
def track_cpu(ctx: c_void_p) -> c_int64: def track_cpu(ctx: c_void_p) -> c_int64:
cpu = smp_processor_id() cpu = smp_processor_id()
print(f"Running on CPU {cpu}") print(f"Running on CPU {cpu}")
return c_int64(0) return 0
``` ```
**Returns:** `c_int32` - The current CPU ID **Returns:** `c_int32` - The current CPU ID
@ -126,19 +126,21 @@ def track_cpu(ctx: c_void_p) -> c_int64:
Safely read data from kernel memory. Safely read data from kernel memory.
> **Linux Kernel Helper:** `bpf_probe_read()`
```python ```python
from pythonbpf.helper import probe_read from pythonbpf.helper import probe_read
@bpf @bpf
def read_kernel_data(ctx: c_void_p) -> c_int64: def read_kernel_data(ctx: c_void_p) -> c_int64:
dst = c_uint64(0) dst = 0
size = 8 size = 8
src = c_void_p(...) # kernel address src = ctx # kernel address
result = probe_read(dst, size, src) result = probe_read(dst, size, src)
if result == 0: if result == 0:
print(f"Read value: {dst}") print(f"Read value: {dst}")
return c_int64(0) return 0
``` ```
**Parameters:** **Parameters:**
@ -154,19 +156,7 @@ def read_kernel_data(ctx: c_void_p) -> c_int64:
Safely read a null-terminated string from kernel memory. Safely read a null-terminated string from kernel memory.
```python > **Linux Kernel Helper:** `bpf_probe_read_str()`
from pythonbpf.helper import probe_read_str
@bpf
def read_filename(ctx: c_void_p) -> c_int64:
filename = str(256)
src = c_void_p(...) # pointer to filename in kernel
result = probe_read_str(filename, src)
if result > 0:
print(f"Filename: {filename}")
return c_int64(0)
```
**Parameters:** **Parameters:**
* `dst` - Destination buffer (string) * `dst` - Destination buffer (string)
@ -174,32 +164,14 @@ def read_filename(ctx: c_void_p) -> c_int64:
**Returns:** `c_int64` - Length of string on success, negative on error **Returns:** `c_int64` - Length of string on success, negative on error
#### deref()
Dereference a pointer safely.
```python
from pythonbpf.helper import deref
@bpf
def access_pointer(ctx: c_void_p) -> c_int64:
ptr = c_void_p(...)
value = deref(ptr)
print(f"Value at pointer: {value}")
return c_int64(0)
```
**Parameters:**
* `ptr` - Pointer to dereference
**Returns:** The dereferenced value or 0 if null
### Random Numbers ### Random Numbers
#### random() #### random()
Generate a pseudo-random 32-bit number. Generate a pseudo-random 32-bit number.
> **Linux Kernel Helper:** `bpf_get_prandom_u32()`
```python ```python
from pythonbpf.helper import random from pythonbpf.helper import random
@ -209,23 +181,19 @@ def sample_events(ctx: c_void_p) -> c_int64:
# Sample 1% of events # Sample 1% of events
if (random() % 100) == 0: if (random() % 100) == 0:
print("Sampled event") print("Sampled event")
return c_int64(0) return 0
``` ```
**Returns:** `c_int32` - A pseudo-random number **Returns:** `c_int32` - A pseudo-random number
**Use cases:**
* Event sampling
* Load shedding
* A/B testing
* Randomized algorithms
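The modulo test used for sampling can be checked in userspace before it goes into a BPF program. The sketch below simulates 32-bit random values with Python's standard `random` module (not `pythonbpf.helper.random`) and measures how often `value % 100 < percent` fires; `should_sample` is a hypothetical name.

```python
import random  # Python standard library, used here to simulate the BPF helper


def should_sample(value: int, percent: int) -> bool:
    """Return True for roughly `percent` out of every 100 random values."""
    return (value % 100) < percent


rng = random.Random(0)  # fixed seed so runs are repeatable
total = 100_000
hits = sum(should_sample(rng.getrandbits(32), 5) for _ in range(total))
print(f"sampled {hits / total:.2%} of events")
```

Because 2^32 is not a multiple of 100, the modulo introduces a tiny bias, which is negligible for sampling purposes.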
### Network Helpers ### Network Helpers
#### skb_store_bytes() #### skb_store_bytes()
Store bytes into a socket buffer (for network programs). Store bytes into a socket buffer (for network programs).
> **Linux Kernel Helper:** `bpf_skb_store_bytes()`
```python ```python
from pythonbpf.helper import skb_store_bytes from pythonbpf.helper import skb_store_bytes
@ -235,9 +203,9 @@ def modify_packet(ctx: c_void_p) -> c_int32:
offset = 14 # Skip Ethernet header offset = 14 # Skip Ethernet header
data = b"\x00\x01\x02\x03" data = b"\x00\x01\x02\x03"
size = len(data) size = len(data)
result = skb_store_bytes(offset, data, size) result = skb_store_bytes(offset, data, size)
return c_int32(0) return 0
``` ```
**Parameters:** **Parameters:**
@ -277,7 +245,7 @@ from ctypes import c_void_p, c_int64
@section("tracepoint/syscalls/sys_enter_execve") @section("tracepoint/syscalls/sys_enter_execve")
def trace_exec(ctx: c_void_p) -> c_int64: def trace_exec(ctx: c_void_p) -> c_int64:
print("Process started") # This goes to trace_pipe print("Process started") # This goes to trace_pipe
return c_int64(0) return 0
@bpf @bpf
@bpfglobal @bpfglobal
@ -336,7 +304,7 @@ from ctypes import c_void_p, c_int64
@section("tracepoint/syscalls/sys_enter_execve") @section("tracepoint/syscalls/sys_enter_execve")
def trace_exec(ctx: c_void_p) -> c_int64: def trace_exec(ctx: c_void_p) -> c_int64:
print(f"PID:{pid()}") print(f"PID:{pid()}")
return c_int64(0) return 0
@bpf @bpf
@bpfglobal @bpfglobal
@ -382,20 +350,20 @@ def read_start(ctx: c_void_p) -> c_int64:
process_id = pid() process_id = pid()
start = ktime() start = ktime()
start_times.update(process_id, start) start_times.update(process_id, start)
return c_int64(0) return 0
@bpf @bpf
@section("tracepoint/syscalls/sys_exit_read") @section("tracepoint/syscalls/sys_exit_read")
def read_end(ctx: c_void_p) -> c_int64: def read_end(ctx: c_void_p) -> c_int64:
process_id = pid() process_id = pid()
start = start_times.lookup(process_id) start = start_times.lookup(process_id)
if start: if start:
latency = ktime() - start latency = ktime() - start
print(f"Read latency: {latency} ns") print(f"Read latency: {latency} ns")
start_times.delete(process_id) start_times.delete(process_id)
return c_int64(0) return 0
@bpf @bpf
@bpfglobal @bpfglobal
@ -419,9 +387,9 @@ from ctypes import c_void_p, c_int64
def track_exec(ctx: c_void_p) -> c_int64: def track_exec(ctx: c_void_p) -> c_int64:
process_id = pid() process_id = pid()
user_id = uid() user_id = uid()
print(f"User {user_id} started process (PID: {process_id})") print(f"User {user_id} started process (PID: {process_id})")
return c_int64(0) return 0
@bpf @bpf
@bpfglobal @bpfglobal
@ -451,13 +419,13 @@ def cpu_counts() -> HashMap:
def count_switches(ctx: c_void_p) -> c_int64: def count_switches(ctx: c_void_p) -> c_int64:
cpu = smp_processor_id() cpu = smp_processor_id()
count = cpu_counts.lookup(cpu) count = cpu_counts.lookup(cpu)
if count: if count:
cpu_counts.update(cpu, count + 1) cpu_counts.update(cpu, count + 1)
else: else:
cpu_counts.update(cpu, c_uint64(1)) cpu_counts.update(cpu, 1)
return c_int64(0) return 0
@bpf @bpf
@bpfglobal @bpfglobal
@ -491,8 +459,8 @@ def sample_opens(ctx: c_void_p) -> c_int64:
if (random() % 100) < 5: if (random() % 100) < 5:
process_id = pid() process_id = pid()
print(f"Sampled: PID {process_id} opening file") print(f"Sampled: PID {process_id} opening file")
return c_int64(0) return 0
@bpf @bpf
@bpfglobal @bpfglobal
@ -504,56 +472,12 @@ b.load_and_attach()
trace_pipe() trace_pipe()
``` ```
## Best Practices
1. **Use appropriate helpers** - Choose the right helper for your use case
2. **Handle errors** - Check return values from helpers like `probe_read()`
3. **Minimize overhead** - Helper calls have cost; use judiciously
4. **Sample when appropriate** - Use `random()` for high-frequency events
5. **Clean up resources** - Delete map entries when done
## Common Patterns
### Store-and-Compare Pattern
```python
# Store a value
key = pid()
value = ktime()
my_map.update(key, value)
# Later: compare
stored = my_map.lookup(key)
if stored:
difference = ktime() - stored
```
### Filtering Pattern
```python
# Filter by user
user_id = uid()
if user_id == 0: # Only root
# Process event
pass
```
### Sampling Pattern
```python
# Sample 1 in N events
if (random() % N) == 0:
# Process sampled event
pass
```
## Troubleshooting ## Troubleshooting
### Helper Not Available ### Helper Not Available
If a helper function doesn't work: If a helper function doesn't work:
* Check your kernel version (some helpers are newer) * Check your kernel version (some helpers are newer)
* Verify the helper is available with `bpftool feature`
* Ensure your LICENSE is GPL-compatible * Ensure your LICENSE is GPL-compatible
### Trace Pipe Access Denied ### Trace Pipe Access Denied
@ -563,13 +487,6 @@ If `trace_pipe()` fails:
* Check `/sys/kernel/tracing/` is accessible * Check `/sys/kernel/tracing/` is accessible
* Verify tracing is enabled in kernel config * Verify tracing is enabled in kernel config
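The accessibility check can be scripted. The following sketch looks for a readable `trace_pipe` under `/sys/kernel/tracing` and, as an assumption not taken from this guide, also under the legacy debugfs location `/sys/kernel/debug/tracing`. `find_trace_pipe` is a hypothetical helper, not a PythonBPF function.

```python
import os

# The second path is the legacy debugfs mount point; including it is an
# assumption of this sketch, the guide itself only names /sys/kernel/tracing.
TRACEFS_CANDIDATES = ("/sys/kernel/tracing", "/sys/kernel/debug/tracing")


def find_trace_pipe(candidates=TRACEFS_CANDIDATES):
    """Return the first readable trace_pipe path, or None if none is readable."""
    for base in candidates:
        path = os.path.join(base, "trace_pipe")
        if os.access(path, os.R_OK):
            return path
    return None


path = find_trace_pipe()
if path is None:
    print("trace_pipe is not readable; check privileges and that tracefs is mounted")
else:
    print(f"trace_pipe is readable at {path}")
```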
### probe_read Failures
If `probe_read()` returns errors:
* Ensure the source address is valid kernel memory
* Check that the size is reasonable
* Verify you're not reading from restricted areas
## Next Steps ## Next Steps
* Explore {doc}`maps` for data storage with helpers * Explore {doc}`maps` for data storage with helpers


@ -6,6 +6,11 @@ Maps are BPF data structures that provide storage and communication mechanisms.
* Share data between multiple BPF programs * Share data between multiple BPF programs
* Communicate with userspace applications * Communicate with userspace applications
```{note}
**Work in Progress:** PythonBPF is under active development. We are constantly adding support for more map types, helpers, and kfuncs. Check back for updates!
```
For comprehensive documentation on BPF maps, see the [eBPF Maps documentation on ebpf.io](https://ebpf.io/what-is-ebpf/#maps).
## Map Types ## Map Types
PythonBPF supports several map types, each optimized for different use cases. PythonBPF supports several map types, each optimized for different use cases.
@ -14,6 +19,8 @@ PythonBPF supports several map types, each optimized for different use cases.
Hash maps provide efficient key-value storage with O(1) lookup time. Hash maps provide efficient key-value storage with O(1) lookup time.
> **Linux Kernel Map Type:** `BPF_MAP_TYPE_HASH`
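
To reason about lookup, update, and delete semantics without loading a BPF program, a plain dictionary can stand in for a hash map. The class below is a userspace sketch only: `FakeHashMap` is hypothetical, it borrows the method names used in the helper examples (`lookup`, `update`, `delete`), and it does not model kernel details such as a fixed maximum number of entries.

```python
class FakeHashMap:
    """Userspace stand-in for a BPF hash map; not PythonBPF's HashMap."""

    def __init__(self):
        self._data = {}

    def lookup(self, key):
        # Returns None when the key is absent.
        return self._data.get(key)

    def update(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


# Lookup-or-initialize counter, keyed by a made-up sequence of CPU IDs.
cpu_counts = FakeHashMap()
for cpu in (0, 1, 0, 0):
    count = cpu_counts.lookup(cpu)
    cpu_counts.update(cpu, 1 if count is None else count + 1)

print(cpu_counts.lookup(0), cpu_counts.lookup(1))
```

The sketch tests `count is None` rather than truthiness so that a stored value of 0 is not mistaken for a missing key.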
#### Definition #### Definition
```python ```python
@ -133,6 +140,8 @@ if __name__ == "__main__":
Perf event arrays are used to send data from BPF programs to userspace with high throughput. Perf event arrays are used to send data from BPF programs to userspace with high throughput.
> **Linux Kernel Map Type:** `BPF_MAP_TYPE_PERF_EVENT_ARRAY`
#### Definition #### Definition
```python ```python
@ -227,6 +236,8 @@ def LICENSE() -> str:
Ring buffers provide efficient, ordered event delivery with lower overhead than perf event arrays. Ring buffers provide efficient, ordered event delivery with lower overhead than perf event arrays.
> **Linux Kernel Map Type:** `BPF_MAP_TYPE_RINGBUF`
#### Definition #### Definition
```python ```python