200 Commits

Author SHA1 Message Date
ef128f3752 Merge pull request #79 from pythonbpf/dependabot/github_actions/actions-c2e7f7cad0
Bump the actions group with 2 updates
2025-12-25 17:40:36 +05:30
b92208ed0d Bump the actions group with 2 updates
Bumps the actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [actions/download-artifact](https://github.com/actions/download-artifact).


Updates `actions/upload-artifact` from 5 to 6
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v5...v6)

Updates `actions/download-artifact` from 6 to 7
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v6...v7)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: actions/download-artifact
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-15 10:24:00 +00:00
2bd8e73724 Add symbolization example 2025-12-09 00:36:50 +05:30
641f8bacbe Merge pull request #78 from pythonbpf/kfunc
move examples to the correct directory
2025-12-08 21:41:52 +05:30
749b06020d move examples to examples folder 2025-12-08 21:39:50 +05:30
0ce5add39b compact the disksnoop.py example
Signed-off-by: varun-r-mallya <varunrmallya@gmail.com>
2025-12-08 16:23:44 +05:30
d0e2360f46 Add anomaly-detection example 2025-12-02 04:01:41 +05:30
049ec55e85 bump version
Signed-off-by: varun-r-mallya <varunrmallya@gmail.com>
2025-11-30 05:55:22 +05:30
77901accf2 fix xdp test c form test
Signed-off-by: varun-r-mallya <varunrmallya@gmail.com>
2025-11-30 05:53:32 +05:30
0616a2fccb Add requirements.txt for BCC-Examples 2025-11-30 05:39:58 +05:30
526425a267 Add command to copy vmlinux.py for container-monitor 2025-11-30 05:37:15 +05:30
466ecdb6a4 Update Python package installation in README 2025-11-30 05:36:16 +05:30
752a10fa5f Add installation instructions to README
Added installation instructions and dependencies section to README.
2025-11-30 05:35:50 +05:30
3602b502f4 Add steps to run container-example in BCC-Examples/README.md 2025-11-30 05:34:08 +05:30
808db2722d Add instructions to run examples in README 2025-11-30 05:27:12 +05:30
99fc5d75cc Refactor disksnoop.py to remove logging and main block
Removed logging setup and main execution block from disksnoop.py.
2025-11-30 04:53:53 +05:30
c91e69e2f7 Add bpftool to installation dependencies 2025-11-30 04:35:15 +05:30
dc995a1448 Fix variable initialization in BPF tracepoint example 2025-11-30 04:31:39 +05:30
0fd6bea211 Fix return value in README example 2025-11-30 04:29:31 +05:30
01d234ac86 Merge pull request #77 from pythonbpf/fix-vmlinux-ir-gen
Add a web dashboard to container monitor
2025-11-28 22:12:47 +05:30
c97efb2570 change web version 2025-11-28 22:11:41 +05:30
76c982e15e Add a web dashboard 2025-11-28 21:30:41 +05:30
2543826e85 Merge pull request #75 from pythonbpf/fix-vmlinux-ir-gen
add container-monitor example
2025-11-28 21:12:45 +05:30
650744f843 beautify container monitor 2025-11-28 21:10:39 +05:30
d73c793989 format chore 2025-11-28 21:02:29 +05:30
bbe4990878 add container-monitor example 2025-11-28 21:02:29 +05:30
600993f626 add syscall monitor 2025-11-28 21:02:28 +05:30
6c55d56ef0 add file io and network stats in container monitor example 2025-11-28 21:02:28 +05:30
704b0d8cd3 Fix debug info generation of PerfEventArray maps 2025-11-28 21:02:27 +05:30
0e50079d88 Add passing test struct_pylib.py 2025-11-28 21:02:27 +05:30
d457f87410 Bump actions/checkout from 5 to 6 in the actions group
Bumps the actions group with 1 update: [actions/checkout](https://github.com/actions/checkout).


Updates `actions/checkout` from 5 to 6
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-28 21:02:26 +05:30
4ea02745b3 Janitorial: Remove useless comments 2025-11-28 21:02:26 +05:30
84edddb685 Fix passing test hash_map_struct to include char array test 2025-11-28 21:02:25 +05:30
6f017a9176 Unify struct and pointer to struct handling, abstract null check in ir_ops 2025-11-28 21:02:25 +05:30
24e5829b80 format chore 2025-11-28 14:52:45 +05:30
2daedc5882 Fix debug info generation of PerfEventArray maps 2025-11-28 14:50:40 +05:30
14af7ec4dd add file_io.bpf.py to make container-monitor 2025-11-28 14:47:29 +05:30
536ea4855e add cgroup helper 2025-11-28 14:47:01 +05:30
5ba29db362 fix the c form of the xdp program 2025-11-27 23:44:44 +05:30
0ca835079d Merge pull request #74 from pythonbpf/fix-vmlinux-ir-gen
Fix some issues with vmlinux struct usage in syntax
TODO: THIS DOES NOT FIX the XDP example. 
The issue with the XDP example here is that the range is not being tracked due to multiple loads from stack which is making the verifier lose track of the range on that value.
2025-11-27 23:03:45 +05:30
de8c486461 fix: remove deref_to_depth on single depth pointers 2025-11-27 22:59:34 +05:30
f135cdbcc0 format chore 2025-11-27 14:03:12 +05:30
a8595ff1d2 feat: allocate tmp variable for pointer to vmlinux struct field access. 2025-11-27 14:02:00 +05:30
d43d3ad637 clear disksnoop output 2025-11-27 12:45:48 +05:30
9becee8f77 add expected type check 2025-11-27 12:44:12 +05:30
189526d5ca format chore 2025-11-27 12:42:10 +05:30
1593b7bcfe feat:user defined struct casting 2025-11-27 12:41:57 +05:30
127852ee9f Add passing test struct_pylib.py 2025-11-27 04:45:34 +05:30
4905649700 feat:non struct field values can be cast 2025-11-26 14:18:40 +05:30
7b7b00dbe7 add disksnoop ipynb 2025-11-25 22:51:24 +05:30
102e4ca78c add disksnoop example 2025-11-24 22:50:39 +05:30
2fd4fefbcc Merge pull request #73 from pythonbpf/dependabot/github_actions/actions-76468cb07f
Bump actions/checkout from 5 to 6 in the actions group
2025-11-24 17:15:55 +05:30
016fd5de5c Bump actions/checkout from 5 to 6 in the actions group
Bumps the actions group with 1 update: [actions/checkout](https://github.com/actions/checkout).


Updates `actions/checkout` from 5 to 6
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-24 11:44:07 +00:00
8ad5fb8a3a Janitorial: Remove useless comments 2025-11-23 06:29:44 +05:30
bf9635e324 Fix passing test hash_map_struct to include char array test 2025-11-23 06:27:01 +05:30
cbe365d760 Unify struct and pointer to struct handling, abstract null check in ir_ops 2025-11-23 06:26:09 +05:30
fed6af1ed6 bump version and prepare for release 2025-11-22 13:54:41 +05:30
18886816fb Merge pull request #68 from pythonbpf/request-struct
Support enough machinery to make request struct work
2025-11-22 13:48:06 +05:30
a2de15fb1e add c_int to type_deducer.py 2025-11-22 13:36:21 +05:30
9def969592 Make map val struct type allocation work by fixing pointer deref and debuginfogen: WIP 2025-11-22 13:20:09 +05:30
081ee5cb4c move requests.py to passing tests 2025-11-22 13:19:55 +05:30
a91c3158ad sort fields in debug info by offset order 2025-11-22 12:35:47 +05:30
2b3635fe20 format chore 2025-11-22 01:48:44 +05:30
6f25c554a9 fix CO-RE read for cast structs 2025-11-22 01:47:25 +05:30
84507b8b98 add btf probe read kernel helper 2025-11-22 00:57:12 +05:30
a42a75179d format chore 2025-11-22 00:37:39 +05:30
377fa4041d add regular struct field access handling in vmlinux_registry.py 2025-11-22 00:36:59 +05:30
99321c7669 add a failing C test 2025-11-21 23:01:08 +05:30
11850d16d3 field check in allocation pass 2025-11-21 21:47:58 +05:30
9ee821c7f6 make pointer allocation feasible but subverting LLC 2025-11-21 21:47:55 +05:30
25394059a6 allow casting 2025-11-21 21:47:10 +05:30
fde8eab775 allow allocation pass on vmlinux cast 2025-11-21 21:47:07 +05:30
42b8865a56 Merge branch 'master' into request-struct 2025-11-21 02:10:52 +05:30
144d9b0ab4 change c-file test structure 2025-11-20 17:24:02 +05:30
902a52a07d remove debug print statements 2025-11-20 14:39:13 +05:30
306570953b format chore 2025-11-20 14:18:45 +05:30
740eed45e1 add placeholder debug info to shut llvmlite up about NoneType 2025-11-20 14:17:57 +05:30
c8801f4c3e nonetype not parsed 2025-11-19 23:35:10 +05:30
e5b3b001ce Minor fix for PTR_TO_MAP_VALUE_OR_NULL target 2025-11-19 04:29:35 +05:30
19b42b9a19 Allocate hashmap lookup return vars based on the value type of said hashmap 2025-11-19 04:09:51 +05:30
9f5ec62383 Add get_uint8_type to DebugInfoGenerator 2025-11-19 03:24:40 +05:30
7af54df7c0 Add passing test hash_map_struct.py for using structs as hashmap key/val types 2025-11-19 00:17:01 +05:30
573bbb350e Allow structs to be key/val type for hashmaps 2025-11-19 00:08:15 +05:30
64679f8072 Add skeleton _get_key_val_dbg_type in maps_debug_info.py 2025-11-18 05:00:00 +05:30
5667facf23 Pass down structs_sym_tab to maps_debug_info, allow vmlinux enums to be used in an indexed format for map declaration 2025-11-18 04:34:51 +05:30
4f8af16a17 Pass structs_sym_tab to maps_proc 2025-11-18 04:34:42 +05:30
b84884162d Merge pull request #69 from pythonbpf/symex
Add support for userspace+kernelspace stack trace example using blazesym
2025-11-17 01:47:35 +05:30
e9bb90cb70 Add docstring for bpf_get_stack_emitter 2025-11-17 01:46:57 +05:30
49740598ea format chore 2025-11-13 09:31:10 +05:30
73bbf00e7c add tests 2025-11-13 09:29:53 +05:30
9d76502d5a Fix get_flags_val usage 2025-11-13 02:24:35 +05:30
a10da4a277 Implement bpf_get_stack handler 2025-11-13 00:59:50 +05:30
29e90601b7 Init bpf_get_stack emitter 2025-11-13 00:51:48 +05:30
56df05a93c Janitorial formatting 2025-11-12 14:38:35 +05:30
a55efc6469 Implement output helper for RingBuf maps, add a match-case based dispatch for output helper handlers for multiple map types 2025-11-12 14:06:09 +05:30
64cd2d2fc2 Set minimum supported Python version to 3.10 2025-11-12 14:06:00 +05:30
cbddc0aa96 Introduce MapSymbol to propagate map type info in map_sym_tab 2025-11-12 13:16:23 +05:30
209df33c8f Add RingBuf submit and reserve helpers 2025-11-12 03:53:16 +05:30
7a56e5d0cd Initialize required helpers for ringbuffer 2025-11-12 01:59:07 +05:30
1d7a436c9f Add linting function for RingBuf.discard 2025-11-12 01:30:15 +05:30
5eaeb3e921 Add max_entries constraints for RingBuffer 2025-11-12 01:27:41 +05:30
cd52d0d91b Rename RingBuf map to RingBuffer 2025-11-12 01:07:12 +05:30
df981be095 Janitorial format 2025-11-11 21:08:06 +05:30
316c21c428 Fix char_array to pointer/int detection fallback in helper_utils 2025-11-11 21:00:42 +05:30
c883d95655 Minor fix - check expr type before sending to char_array handler in printk_formatter 2025-11-11 17:43:20 +05:30
f7dee329cb fix nested pointers issue in array generation and also fix zero length array IR generation 2025-11-10 20:29:28 +05:30
5031f90377 fix stacked vmlinux struct parsing issue 2025-11-10 20:06:04 +05:30
95a624044a fix type error 2025-11-08 20:28:56 +05:30
c5bef26b88 add multi imports to single import line. 2025-11-08 18:08:04 +05:30
5a8b64f1d9 Merge pull request #64 from pythonbpf/all_helpers
Add support for all eBPF helpers
2025-11-07 19:26:55 +05:30
cf99b3bb9a Fix call to get_or_create_ptr_from_arg for probe_read_str 2025-11-07 19:16:48 +05:30
6c85b248ce Init sz in get_or_create_ptr_from_arg 2025-11-07 19:03:21 +05:30
b5a3494cc6 Fix typo in get_or_create_ptr_from_arg 2025-11-07 19:01:40 +05:30
be62972974 Fix ScratchPoolManager::counter 2025-11-07 19:00:57 +05:30
2f4a7d2f90 Remove get_struct_char_array_ptr in favour of get_char_array_ptr_and_size, wrap it in get_or_crate_ptr_from_arg to use in bpf_helper_handler 2025-11-07 18:54:59 +05:30
3ccd3f767e Add expected types for pointer creation of args in probe_read handler 2025-11-06 19:59:04 +05:30
2e37726922 Add signature relection for all helper handlers except print 2025-11-06 19:47:57 +05:30
5b36726b7d Make bpf_skb_store_bytes work 2025-11-05 20:02:39 +05:30
faad3555dc Merge pull request #67 from pythonbpf/32int_support
add i32 support and special support for xdp_md with zext
2025-11-05 19:42:05 +05:30
3e6cea2b67 Move get_struct_char_array_ptr from helper/printk_formatter to helper/helper_utils, enable array to ptr conversion in skb_store_bytes 2025-11-05 19:10:58 +05:30
5ad33b011e move a test to passing 2025-11-05 18:02:28 +05:30
2f4785b796 add int type conversion for all vmlinux struct field int types. 2025-11-05 18:01:41 +05:30
c5fdd3bce2 move some tests to passing 2025-11-05 17:48:26 +05:30
b0d35693b9 format chore 2025-11-05 17:44:45 +05:30
44c6ceda27 fix context debug info repetition circular reference error 2025-11-05 17:44:29 +05:30
2685d0a0ee add i32 support special case and find ctx repetition in multiple functions error. 2025-11-05 17:38:38 +05:30
338d4994d8 Fix count_temps_in_call to only look for Pointer args of a helper_sig 2025-11-05 17:36:37 +05:30
3078d4224d Add typed scratch space support to the bpf_skb_store_bytes helper 2025-11-04 16:09:11 +05:30
7d29790f00 Make use of new get_next_temp in helpers 2025-11-04 16:02:56 +05:30
963e2a8171 Change ScratchPoolManager to use typed scratch space 2025-11-04 14:16:44 +05:30
123a92af1d Change allocation pass to generate typed temp variables 2025-11-04 06:20:39 +05:30
752f564d3f Change count_temps_in_call to return hashmap of types 2025-11-04 05:40:22 +05:30
d8cddb9799 Add signature extraction to HelperHandlerRegistry 2025-11-04 05:19:22 +05:30
33e18f6d6d Introduce HelperSignature in HelperHandlerRegistry 2025-11-03 21:21:13 +05:30
5e371787eb Fix the number of args for skb_store_bytes by making the first arg implicit 2025-11-03 21:11:16 +05:30
67c9d9b932 Fix imports for bpf_skb_store_bytes 2025-11-02 04:33:45 +05:30
f757a32a63 Implement bpf_skb_store_bytes_emitter 2025-11-02 04:32:05 +05:30
c5de92b9d0 Add BPF_SKB_STORE_BYTES to HelperIDs 2025-11-02 04:17:15 +05:30
4efd3223cd Add passing uid_gid helper test 2025-11-02 03:47:26 +05:30
4884ed7577 Fix imports for bpf_get_current_uid_gid 2025-11-02 03:35:41 +05:30
5b7769dd38 Implement bpf_get_current_uid_gid_emitter 2025-11-02 03:34:04 +05:30
b7c1e92f05 Add BPF_GET_CURRENT_UID_GID to HelperIDs 2025-11-02 03:29:02 +05:30
8b28a927c3 Add helpful TODO to PID_TGID emitter 2025-11-02 03:27:27 +05:30
3489f45b63 Add failing XDP vmlinux tests 2025-11-01 18:57:07 +05:30
204ec26154 add i32 support and make it extensible 2025-11-01 14:44:39 +05:30
f9ee43e7ef Add passing test smp_processor_id.py for helpers 2025-11-01 14:13:52 +05:30
dabb8bf0df Fix imports for BPF_GET_SMP_PROCESSOR_ID 2025-11-01 14:07:47 +05:30
19dedede53 Implement BPF_GET_SMP_PROCESSOR_ID helper 2025-11-01 14:05:50 +05:30
82cac8f8ef Add BPF_GET_SMP_PROCESSOR_ID to HelperIDs 2025-11-01 14:02:07 +05:30
70a04f54d1 Add passing test for bpf_probe_read helper 2025-11-01 13:51:08 +05:30
ec2ea835e5 Fix imports and type issues for bpf_probe_read 2025-11-01 13:50:23 +05:30
2257c175ed Implement BPF_PROBE_READ helper 2025-11-01 13:14:50 +05:30
5bf60d69b8 Add BPF_PROBE_READ to HelperIDs 2025-11-01 12:52:15 +05:30
a9d82d40d3 Merge pull request #60 from pythonbpf/vmlinux-handler
vmlinux handler with struct support for only int64 and unsigned uint64 type struct fields.
2025-11-01 08:15:14 +05:30
85a62d6cd8 add example and support unsigned i64 2025-11-01 08:13:22 +05:30
c3fc790c71 remove fixed TODOs 2025-11-01 07:05:42 +05:30
22e30f04b4 Merge pull request #66 from pythonbpf/dependabot/github_actions/actions-3249c11fdc
Bump the actions group with 2 updates
2025-10-27 17:21:49 +05:30
620b8cb1e7 Bump the actions group with 2 updates
Bumps the actions group with 2 updates: [actions/upload-artifact](https://github.com/actions/upload-artifact) and [actions/download-artifact](https://github.com/actions/download-artifact).


Updates `actions/upload-artifact` from 4 to 5
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v4...v5)

Updates `actions/download-artifact` from 5 to 6
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: actions/download-artifact
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-27 11:49:59 +00:00
1207fe9f92 Update .gitattributes to include new directories 2025-10-27 03:43:38 +05:30
b138405931 Merge pull request #65 from pythonbpf/varun-r-mallya-patch-1
Mark Jupyter Notebook files as vendored
2025-10-27 03:41:59 +05:30
262f00f635 Mark Jupyter Notebook files as vendored 2025-10-27 03:41:15 +05:30
07580dabf2 revert struct reference pointer sizes to i8 to ensure that compiler does not optimize 2025-10-27 03:29:15 +05:30
ac74b03b14 Add TODO to specify flags and DISubprogram. 2025-10-27 03:01:56 +05:30
3bf85e733e add DI subprogram to make CO-RE work fully. 2025-10-27 03:00:13 +05:30
73f7c80eca add scope field separately to subroutine type remove circular dependency 2025-10-27 02:48:06 +05:30
238697469a create debug info to subroutine type 2025-10-27 02:19:08 +05:30
0006e26b08 Add passing test for bpf_get_prandom_u32 implementation 2025-10-27 01:09:27 +05:30
5cbd9a531e Add bpf_get_prandom_u32 helper 2025-10-27 01:08:56 +05:30
8bd210cede add debug info storage on assignment_info.py dataclass 2025-10-26 15:46:42 +05:30
7bf6f9c48c add function_debug_info.py and format 2025-10-26 15:12:36 +05:30
a1fe2ed4bc change to 64 bit pointers. May be an issue. revert this commit if issues arise 2025-10-26 15:00:53 +05:30
93285dbdd8 geenrate gep IR 2025-10-26 02:12:33 +05:30
1ea44dd8e1 Use pointer arithmetic to resolve vmlinux struct fields 2025-10-25 05:40:45 +05:30
96216d4411 Consistently use Dataclass syntac for AssignmentInfo and related classes 2025-10-25 05:10:47 +05:30
028d9c2c08 generate IR partly 2025-10-25 04:41:13 +05:30
c6b5ecb47e find global variable ir and field data from metadata 2025-10-24 03:34:27 +05:30
30bcfcbbd0 remove compile error on normal c_void_p in arg and separate localsymbol to avoid circular dep 2025-10-24 03:08:39 +05:30
f18a4399ea format chore 2025-10-24 02:40:07 +05:30
4e01df735f complete part of expr passing for attribute of i64 type 2025-10-24 02:38:39 +05:30
64674cf646 add alloc for only i64 2025-10-24 02:06:39 +05:30
5c1e7103a6 Add Python notebook examples for current BCC examples 2025-10-23 00:28:45 +05:30
576fa2f106 Add interactive Python notebook for hello_world BCC Example 2025-10-22 21:58:32 +05:30
76a873cb0d Update clone-matplotlib example 2025-10-22 21:47:16 +05:30
e86c6082c9 Add BCC examples and change dir structure in setup.sh 2025-10-22 21:46:45 +05:30
cb1ad15f43 Fix examples/clone_plot to use new syntax and pylibbpf API 2025-10-22 20:54:59 +05:30
b24b3ed250 Remove TypedDict from assignment_info in favour of dataclasses 2025-10-22 20:48:56 +05:30
beaad996db Fix map access syntax in examples/xdp_pass 2025-10-22 20:07:38 +05:30
99b92e44e3 Fix exapmles/kprobes to use latest pylibbpf 2025-10-22 20:04:02 +05:30
ce7adaadb6 Fix examples/hello_world to use latest pylibbpf 2025-10-22 19:57:47 +05:30
5ac316a1ac Fix examples/binops_demo.py syntax 2025-10-22 19:52:23 +05:30
36a1a0903e Merge branch 'master' into vmlinux-handler 2025-10-22 12:02:51 +05:30
f2bc7f1434 pass context to memory allocation 2025-10-22 12:01:52 +05:30
b3921c424d parse context from first function argument to local symbol table 2025-10-22 11:40:49 +05:30
7a99d21b24 Fix typo in BCC Examples README 2025-10-22 04:49:22 +05:30
cf05e4959d Add README for BCC Examples 2025-10-22 04:45:09 +05:30
adf32560a0 bpf passthrough gen in codegen
Signed-off-by: varun-r-mallya <varunrmallya@gmail.com>
2025-10-22 03:45:54 +05:30
21cea97d78 add return None statements 2025-10-21 07:02:34 +05:30
d8729342dc add bpf_passthrough generation 2025-10-21 07:01:37 +05:30
4179fbfc88 move around examples 2025-10-21 06:03:16 +05:30
ba397036b4 add failing examples to work on 2025-10-21 05:49:44 +05:30
98 changed files with 8737 additions and 122415 deletions

2
.gitattributes vendored
View File

@ -1 +1,3 @@
tests/c-form/vmlinux.h linguist-vendored
examples/ linguist-vendored
BCC-Examples/ linguist-vendored

View File

@ -12,7 +12,7 @@ jobs:
name: Format
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
with:
python-version: "3.x"

View File

@ -20,7 +20,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
with:
@ -33,7 +33,7 @@ jobs:
python -m build
- name: Upload distributions
uses: actions/upload-artifact@v4
uses: actions/upload-artifact@v6
with:
name: release-dists
path: dist/
@ -59,7 +59,7 @@ jobs:
steps:
- name: Retrieve release distributions
uses: actions/download-artifact@v5
uses: actions/download-artifact@v7
with:
name: release-dists
path: dist/

31
BCC-Examples/README.md Normal file
View File

@ -0,0 +1,31 @@
## BCC examples ported to PythonBPF
This folder contains examples of BCC tutorial examples that have been ported to use **PythonBPF**.
## Requirements
- install `pythonbpf` and `pylibbpf` using pip.
- You will also need `matplotlib` for vfsreadlat.py example.
- You will also need `rich` for vfsreadlat_rich.py example.
- You will also need `plotly` and `dash` for vfsreadlat_plotly.py example.
- All of these are added to `requirements.txt` file. You can install them using the following command:
```bash
pip install -r requirements.txt
```
## Usage
- You'll need root privileges to run these examples. If you are using a virtualenv, use the following command to run the scripts:
```bash
sudo <path_to_virtualenv>/bin/python3 <script_name>.py
```
- For the disksnoop and container-monitor examples, you need to generate the vmlinux.py file first. Follow the instructions in the [main README](https://github.com/pythonbpf/Python-BPF/tree/master?tab=readme-ov-file#first-generate-the-vmlinuxpy-file-for-your-kernel) to generate the vmlinux.py file.
- For vfsreadlat_plotly.py, run the following command to start the Dash server:
```bash
sudo <path_to_virtualenv>/bin/python3 vfsreadlat_plotly/bpf_program.py
```
Then open your web browser and navigate to the given URL.
- For container-monitor, you need to first copy the vmlinux.py to `container-monitor/` directory.
Then run the following command to run the example:
```bash
cp vmlinux.py container-monitor/
sudo <path_to_virtualenv>/bin/python3 container-monitor/container_monitor.py
```

View File

@ -0,0 +1,122 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "c3520e58-e50f-4bc1-8f9d-a6fecbf6e9f0",
"metadata": {},
"outputs": [],
"source": [
"from vmlinux import struct_request, struct_pt_regs\n",
"from pythonbpf import bpf, section, bpfglobal, map, BPF\n",
"from pythonbpf.helper import ktime\n",
"from pythonbpf.maps import HashMap\n",
"from ctypes import c_int64, c_uint64, c_int32\n",
"\n",
"REQ_WRITE = 1\n",
"\n",
"\n",
"@bpf\n",
"@map\n",
"def start() -> HashMap:\n",
" return HashMap(key=c_uint64, value=c_uint64, max_entries=10240)\n",
"\n",
"\n",
"@bpf\n",
"@section(\"kprobe/blk_mq_end_request\")\n",
"def trace_completion(ctx: struct_pt_regs) -> c_int64:\n",
" # Get request pointer from first argument\n",
" req_ptr = ctx.di\n",
" req = struct_request(ctx.di)\n",
" # Print: data_len, cmd_flags, latency_us\n",
" data_len = req.__data_len\n",
" cmd_flags = req.cmd_flags\n",
" # Lookup start timestamp\n",
" req_tsp = start.lookup(req_ptr)\n",
" if req_tsp:\n",
" # Calculate delta in nanoseconds\n",
" delta = ktime() - req_tsp\n",
"\n",
" # Convert to microseconds for printing\n",
" delta_us = delta // 1000\n",
"\n",
" print(f\"{data_len} {cmd_flags:x} {delta_us}\\n\")\n",
"\n",
" # Delete the entry\n",
" start.delete(req_ptr)\n",
"\n",
" return c_int64(0)\n",
"\n",
"\n",
"@bpf\n",
"@section(\"kprobe/blk_mq_start_request\")\n",
"def trace_start(ctx1: struct_pt_regs) -> c_int32:\n",
" req = ctx1.di\n",
" ts = ktime()\n",
" start.update(req, ts)\n",
" return c_int32(0)\n",
"\n",
"\n",
"@bpf\n",
"@bpfglobal\n",
"def LICENSE() -> str:\n",
" return \"GPL\"\n",
"\n",
"\n",
"b = BPF()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "97040f73-98e0-4993-94c6-125d1b42d931",
"metadata": {},
"outputs": [],
"source": [
"b.load()\n",
"b.attach_all()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b1bd4f51-fa25-42e1-877c-e48a2605189f",
"metadata": {},
"outputs": [],
"source": [
"from pythonbpf import trace_pipe"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "96b4b59b-b0db-4952-9534-7a714f685089",
"metadata": {},
"outputs": [],
"source": [
"trace_pipe()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

48
BCC-Examples/disksnoop.py Normal file
View File

@ -0,0 +1,48 @@
from ctypes import c_int32, c_int64, c_uint64
from vmlinux import struct_pt_regs, struct_request
from pythonbpf import bpf, bpfglobal, compile, map, section
from pythonbpf.helper import ktime
from pythonbpf.maps import HashMap
@bpf
@map
def start() -> HashMap:
return HashMap(key=c_uint64, value=c_uint64, max_entries=10240)
@bpf
@section("kprobe/blk_mq_end_request")
def trace_completion(ctx: struct_pt_regs) -> c_int64:
req_ptr = ctx.di
req = struct_request(ctx.di)
data_len = req.__data_len
cmd_flags = req.cmd_flags
req_tsp = start.lookup(req_ptr)
if req_tsp:
delta = ktime() - req_tsp
delta_us = delta // 1000
print(f"{data_len} {cmd_flags:x} {delta_us}\n")
start.delete(req_ptr)
return c_int64(0)
@bpf
@section("kprobe/blk_mq_start_request")
def trace_start(ctx1: struct_pt_regs) -> c_int32:
req = ctx1.di
ts = ktime()
start.update(req, ts)
return c_int32(0)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile()

View File

@ -0,0 +1,83 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "28cf2e27-41e2-461c-a39c-147417141a4e",
"metadata": {},
"outputs": [],
"source": [
"from pythonbpf import bpf, section, bpfglobal, BPF, trace_fields\n",
"from ctypes import c_void_p, c_int64"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "133190e5-5a99-4585-b6e1-91224ed973c2",
"metadata": {},
"outputs": [],
"source": [
"@bpf\n",
"@section(\"tracepoint/syscalls/sys_enter_clone\")\n",
"def hello_world(ctx: c_void_p) -> c_int64:\n",
" print(\"Hello, World!\")\n",
" return 0\n",
"\n",
"\n",
"@bpf\n",
"@bpfglobal\n",
"def LICENSE() -> str:\n",
" return \"GPL\"\n",
"\n",
"\n",
"# Compile and load\n",
"b = BPF()\n",
"b.load()\n",
"b.attach_all()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d3934efb-4043-4545-ae4c-c50ec40a24fd",
"metadata": {},
"outputs": [],
"source": [
"# header\n",
"print(f\"{'TIME(s)':<18} {'COMM':<16} {'PID':<6} {'MESSAGE'}\")\n",
"\n",
"# format output\n",
"while True:\n",
" try:\n",
" (task, pid, cpu, flags, ts, msg) = trace_fields()\n",
" except ValueError:\n",
" continue\n",
" except KeyboardInterrupt:\n",
" exit()\n",
" print(f\"{ts:<18} {task:<16} {pid:<6} {msg}\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@ -0,0 +1,110 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "79b74928-f4b4-4320-96e3-d973997de2f4",
"metadata": {},
"outputs": [],
"source": [
"from pythonbpf import bpf, map, struct, section, bpfglobal, BPF\n",
"from pythonbpf.helper import ktime, pid, comm\n",
"from pythonbpf.maps import PerfEventArray\n",
"from ctypes import c_void_p, c_int64"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5bdb0329-ae2d-45e8-808e-5ed5b1374204",
"metadata": {},
"outputs": [],
"source": [
"@bpf\n",
"@struct\n",
"class data_t:\n",
" pid: c_int64\n",
" ts: c_int64\n",
" comm: str(16)\n",
"\n",
"\n",
"@bpf\n",
"@map\n",
"def events() -> PerfEventArray:\n",
" return PerfEventArray(key_size=c_int64, value_size=c_int64)\n",
"\n",
"\n",
"@bpf\n",
"@section(\"tracepoint/syscalls/sys_enter_clone\")\n",
"def hello(ctx: c_void_p) -> c_int64:\n",
" dataobj = data_t()\n",
" dataobj.pid, dataobj.ts = pid(), ktime()\n",
" comm(dataobj.comm)\n",
" events.output(dataobj)\n",
" return 0\n",
"\n",
"\n",
"@bpf\n",
"@bpfglobal\n",
"def LICENSE() -> str:\n",
" return \"GPL\"\n",
"\n",
"\n",
"# Compile and load\n",
"b = BPF()\n",
"b.load()\n",
"b.attach_all()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4bcc7d57-6cc4-48a3-bbd2-42ad6263afdf",
"metadata": {},
"outputs": [],
"source": [
"start = 0\n",
"\n",
"\n",
"def callback(cpu, event):\n",
" global start\n",
" if start == 0:\n",
" start = event.ts\n",
" ts = (event.ts - start) / 1e9\n",
" print(f\"[CPU {cpu}] PID: {event.pid}, TS: {ts}, COMM: {event.comm.decode()}\")\n",
"\n",
"\n",
"perf = b[\"events\"].open_perf_buffer(callback, struct_name=\"data_t\")\n",
"print(\"Starting to poll... (Ctrl+C to stop)\")\n",
"print(\"Try running: fork() or clone() system calls to trigger events\")\n",
"\n",
"try:\n",
" while True:\n",
" b[\"events\"].poll(1000)\n",
"except KeyboardInterrupt:\n",
" print(\"Stopping...\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@ -0,0 +1,116 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 9,
"id": "7d5d3cfb-39ba-4516-9856-b3bed47a0cef",
"metadata": {},
"outputs": [],
"source": [
"from pythonbpf import bpf, section, bpfglobal, BPF, trace_pipe\n",
"from ctypes import c_void_p, c_int64"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "cf1c87aa-e173-4156-8f2d-762225bc6d19",
"metadata": {},
"outputs": [],
"source": [
"@bpf\n",
"@section(\"tracepoint/syscalls/sys_enter_clone\")\n",
"def hello_world(ctx: c_void_p) -> c_int64:\n",
" print(\"Hello, World!\")\n",
" return 0\n",
"\n",
"\n",
"@bpf\n",
"@bpfglobal\n",
"def LICENSE() -> str:\n",
" return \"GPL\"\n",
"\n",
"\n",
"b = BPF()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bd81383d-f75a-4269-8451-3d985d85b124",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Cache2 I/O-4716 [003] ...21 8218.000492: bpf_trace_printk: count: 11 with 4716\n",
"\n",
" Cache2 I/O-4716 [003] ...21 8218.000499: bpf_trace_printk: Hello, World!\n",
"\n",
" WebExtensions-5168 [002] ...21 8219.320392: bpf_trace_printk: count: 13 with 5168\n",
"\n",
" WebExtensions-5168 [002] ...21 8219.320399: bpf_trace_printk: Hello, World!\n",
"\n",
" python-21155 [001] ...21 8220.933716: bpf_trace_printk: count: 5 with 21155\n",
"\n",
" python-21155 [001] ...21 8220.933721: bpf_trace_printk: Hello, World!\n",
"\n",
" python-21155 [002] ...21 8221.341290: bpf_trace_printk: count: 6 with 21155\n",
"\n",
" python-21155 [002] ...21 8221.341295: bpf_trace_printk: Hello, World!\n",
"\n",
" Isolated Web Co-5462 [000] ...21 8223.095033: bpf_trace_printk: count: 7 with 5462\n",
"\n",
" Isolated Web Co-5462 [000] ...21 8223.095043: bpf_trace_printk: Hello, World!\n",
"\n",
" firefox-4542 [000] ...21 8227.760067: bpf_trace_printk: count: 8 with 4542\n",
"\n",
" firefox-4542 [000] ...21 8227.760080: bpf_trace_printk: Hello, World!\n",
"\n",
" Isolated Web Co-12404 [003] ...21 8227.917086: bpf_trace_printk: count: 7 with 12404\n",
"\n",
" Isolated Web Co-12404 [003] ...21 8227.917095: bpf_trace_printk: Hello, World!\n",
"\n"
]
}
],
"source": [
"b.load()\n",
"b.attach_all()\n",
"trace_pipe()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "01e1f25b-decc-425b-a1aa-a5e701082574",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@ -0,0 +1,9 @@
# =============================================================================
# Requirements for PythonBPF BCC-Examples
# =============================================================================
dash
matplotlib
numpy
plotly
rich

View File

@ -0,0 +1,107 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "dcab010c-f5e9-446f-9f9f-056cc794ad14",
"metadata": {},
"outputs": [],
"source": [
"from pythonbpf import bpf, map, section, bpfglobal, BPF, trace_fields\n",
"from pythonbpf.helper import ktime\n",
"from pythonbpf.maps import HashMap\n",
"\n",
"from ctypes import c_void_p, c_int64"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "720797e8-9c81-4af6-a385-80f1ec4c0f15",
"metadata": {},
"outputs": [],
"source": [
"@bpf\n",
"@map\n",
"def last() -> HashMap:\n",
" return HashMap(key=c_int64, value=c_int64, max_entries=2)\n",
"\n",
"\n",
"@bpf\n",
"@section(\"tracepoint/syscalls/sys_enter_sync\")\n",
"def do_trace(ctx: c_void_p) -> c_int64:\n",
" ts_key, cnt_key = 0, 1\n",
" tsp, cntp = last.lookup(ts_key), last.lookup(cnt_key)\n",
" if not cntp:\n",
" last.update(cnt_key, 0)\n",
" cntp = last.lookup(cnt_key)\n",
" if tsp:\n",
" delta = ktime() - tsp\n",
" if delta < 1000000000:\n",
" time_ms = delta // 1000000\n",
" print(f\"{time_ms} {cntp}\")\n",
" last.delete(ts_key)\n",
" else:\n",
" last.update(ts_key, ktime())\n",
" last.update(cnt_key, cntp + 1)\n",
" return 0\n",
"\n",
"\n",
"@bpf\n",
"@bpfglobal\n",
"def LICENSE() -> str:\n",
" return \"GPL\"\n",
"\n",
"\n",
"# Compile and load\n",
"b = BPF()\n",
"b.load()\n",
"b.attach_all()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "78a8b82c-7c5f-43c1-9de1-cd982a0f345b",
"metadata": {},
"outputs": [],
"source": [
"print(\"Tracing for quick sync's... Ctrl-C to end\")\n",
"\n",
"# format output\n",
"start = 0\n",
"while True:\n",
" try:\n",
" task, pid, cpu, flags, ts, msg = trace_fields()\n",
" if start == 0:\n",
" start = ts\n",
" ts -= start\n",
" ms, cnt = msg.split()\n",
" print(f\"At time {ts} s: Multiple syncs detected, last {ms} ms ago. Count {cnt}\")\n",
" except KeyboardInterrupt:\n",
" exit()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@ -0,0 +1,134 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "b0d1ab05-0c1f-4578-9c1b-568202b95a5c",
"metadata": {},
"outputs": [],
"source": [
"from pythonbpf import bpf, map, struct, section, bpfglobal, BPF\n",
"from pythonbpf.helper import ktime\n",
"from pythonbpf.maps import HashMap, PerfEventArray\n",
"from ctypes import c_void_p, c_int64"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "85e50d0a-f9d8-468f-8e03-f5f7128f05d8",
"metadata": {},
"outputs": [],
"source": [
"@bpf\n",
"@struct\n",
"class data_t:\n",
" ts: c_int64\n",
" ms: c_int64\n",
"\n",
"\n",
"@bpf\n",
"@map\n",
"def events() -> PerfEventArray:\n",
" return PerfEventArray(key_size=c_int64, value_size=c_int64)\n",
"\n",
"\n",
"@bpf\n",
"@map\n",
"def last() -> HashMap:\n",
" return HashMap(key=c_int64, value=c_int64, max_entries=1)\n",
"\n",
"\n",
"@bpf\n",
"@section(\"tracepoint/syscalls/sys_enter_sync\")\n",
"def do_trace(ctx: c_void_p) -> c_int64:\n",
" dat, dat.ts, key = data_t(), ktime(), 0\n",
" tsp = last.lookup(key)\n",
" if tsp:\n",
" delta = ktime() - tsp\n",
" if delta < 1000000000:\n",
" dat.ms = delta // 1000000\n",
" events.output(dat)\n",
" last.delete(key)\n",
" else:\n",
" last.update(key, ktime())\n",
" return 0\n",
"\n",
"\n",
"@bpf\n",
"@bpfglobal\n",
"def LICENSE() -> str:\n",
" return \"GPL\"\n",
"\n",
"\n",
"# Compile and load\n",
"b = BPF()\n",
"b.load()\n",
"b.attach_all()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "40bb1107-369f-4be7-9f10-37201900c16b",
"metadata": {},
"outputs": [],
"source": [
"print(\"Tracing for quick sync's... Ctrl-C to end\")\n",
"\n",
"# format output\n",
"start = 0\n",
"\n",
"\n",
"def callback(cpu, event):\n",
" global start\n",
" if start == 0:\n",
" start = event.ts\n",
" event.ts -= start\n",
" print(\n",
" f\"At time {event.ts / 1e9} s: Multiple sync detected, Last sync: {event.ms} ms ago\"\n",
" )\n",
"\n",
"\n",
"perf = b[\"events\"].open_perf_buffer(callback, struct_name=\"data_t\")\n",
"print(\"Starting to poll... (Ctrl+C to stop)\")\n",
"print(\"Try running: fork() or clone() system calls to trigger events\")\n",
"\n",
"try:\n",
" while True:\n",
" b[\"events\"].poll(1000)\n",
"except KeyboardInterrupt:\n",
" print(\"Stopping...\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "94a588d9-3a40-437c-a35b-fc40410f3eb7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@ -1,7 +1,6 @@
from pythonbpf import bpf, map, struct, section, bpfglobal, BPF
from pythonbpf.helper import ktime
from pythonbpf.maps import HashMap
from pythonbpf.maps import PerfEventArray
from pythonbpf.maps import HashMap, PerfEventArray
from ctypes import c_void_p, c_int64
@ -69,8 +68,6 @@ def callback(cpu, event):
perf = b["events"].open_perf_buffer(callback, struct_name="data_t")
print("Starting to poll... (Ctrl+C to stop)")
print("Try running: fork() or clone() system calls to trigger events")
try:
while True:
b["events"].poll(1000)

View File

@ -0,0 +1,102 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "bfe01ceb-2f27-41b3-b3ba-50ec65cfddda",
"metadata": {},
"outputs": [],
"source": [
"from pythonbpf import bpf, map, section, bpfglobal, BPF, trace_fields\n",
"from pythonbpf.helper import ktime\n",
"from pythonbpf.maps import HashMap\n",
"\n",
"from ctypes import c_void_p, c_int64"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ddb115f4-20a7-43bc-bb5b-ccbfd6031fc2",
"metadata": {},
"outputs": [],
"source": [
"@bpf\n",
"@map\n",
"def last() -> HashMap:\n",
" return HashMap(key=c_int64, value=c_int64, max_entries=1)\n",
"\n",
"\n",
"@bpf\n",
"@section(\"tracepoint/syscalls/sys_enter_sync\")\n",
"def do_trace(ctx: c_void_p) -> c_int64:\n",
" key = 0\n",
" tsp = last.lookup(key)\n",
" if tsp:\n",
" delta = ktime() - tsp\n",
" if delta < 1000000000:\n",
" time_ms = delta // 1000000\n",
" print(f\"{time_ms}\")\n",
" last.delete(key)\n",
" else:\n",
" last.update(key, ktime())\n",
" return 0\n",
"\n",
"\n",
"@bpf\n",
"@bpfglobal\n",
"def LICENSE() -> str:\n",
" return \"GPL\"\n",
"\n",
"\n",
"# Compile and load\n",
"b = BPF()\n",
"b.load()\n",
"b.attach_all()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e4f46574-9fd8-46e7-9c7b-27a36d07f200",
"metadata": {},
"outputs": [],
"source": [
"print(\"Tracing for quick sync's... Ctrl-C to end\")\n",
"\n",
"# format output\n",
"start = 0\n",
"while True:\n",
" try:\n",
" task, pid, cpu, flags, ts, ms = trace_fields()\n",
" if start == 0:\n",
" start = ts\n",
" ts -= start\n",
" print(f\"At time {ts} s: Multiple syncs detected, last {ms} ms ago\")\n",
" except KeyboardInterrupt:\n",
" exit()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@ -0,0 +1,73 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "bb49598f-b9cc-4ea8-8391-923cad513711",
"metadata": {},
"outputs": [],
"source": [
"from pythonbpf import bpf, section, bpfglobal, BPF, trace_pipe\n",
"from ctypes import c_void_p, c_int64"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5da237b0-1c7d-4ec5-8c24-696b1c1d97fa",
"metadata": {},
"outputs": [],
"source": [
"@bpf\n",
"@section(\"tracepoint/syscalls/sys_enter_sync\")\n",
"def hello_world(ctx: c_void_p) -> c_int64:\n",
" print(\"sys_sync() called\")\n",
" return 0\n",
"\n",
"\n",
"@bpf\n",
"@bpfglobal\n",
"def LICENSE() -> str:\n",
" return \"GPL\"\n",
"\n",
"\n",
"# Compile and load\n",
"b = BPF()\n",
"b.load()\n",
"b.attach_all()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e4c218ac-fe47-4fd1-a27b-c07e02f3cd05",
"metadata": {},
"outputs": [],
"source": [
"print(\"Tracing sys_sync()... Ctrl-C to end.\")\n",
"trace_pipe()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

File diff suppressed because one or more lines are too long

View File

@ -40,16 +40,11 @@ Python-BPF is an LLVM IR generator for eBPF programs written in Python. It uses
---
## Try It Out!
Run
```bash
curl -s https://raw.githubusercontent.com/pythonbpf/Python-BPF/refs/heads/master/tools/setup.sh | sudo bash
```
## Installation
Dependencies:
* `bpftool`
* `clang`
* Python ≥ 3.8
@ -61,6 +56,38 @@ pip install pythonbpf pylibbpf
---
## Try It Out!
#### First, generate the vmlinux.py file for your kernel:
- Install the required dependencies:
- On Ubuntu:
```bash
sudo apt-get install bpftool clang
pip install pythonbpf pylibbpf ctypeslib2
```
- Generate the `vmlinux.py` using:
```bash
sudo tools/vmlinux-gen.py
```
- Copy this file to `BCC-Examples/`
#### Next, install requirements for BCC-Examples:
- These requirements are only required for the python notebooks, vfsreadlat and container-monitor examples.
```bash
pip install -r BCC-Examples/requirements.txt
```
- Now, follow the instructions in the [BCC-Examples/README.md](https://github.com/pythonbpf/Python-BPF/blob/master/BCC-Examples/README.md) to run the examples.
#### To spin up jupyter notebook examples:
- Run and follow the instructions on screen
```bash
curl -s https://raw.githubusercontent.com/pythonbpf/Python-BPF/refs/heads/master/tools/setup.sh | sudo bash
```
- Check the jupyter server on the web browser and run the notebooks in the `BCC-Examples/` folder.
---
## Example Usage
```python
@ -88,16 +115,15 @@ def hist() -> HashMap:
@section("tracepoint/syscalls/sys_enter_clone")
def hello(ctx: c_void_p) -> c_int64:
process_id = pid()
one = 1
prev = hist.lookup(process_id)
if prev:
previous_value = prev + 1
print(f"count: {previous_value} with {process_id}")
hist.update(process_id, previous_value)
return c_int64(0)
return 0
else:
hist.update(process_id, one)
return c_int64(0)
hist.update(process_id, 1)
return 0
@bpf

405
blazesym-example/Cargo.lock generated Normal file
View File

@ -0,0 +1,405 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 4
[[package]]
name = "adler2"
version = "2.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "320119579fcad9c21884f5c4861d16174d0e06250625266f50fe6898340abefa"
[[package]]
name = "anstream"
version = "0.6.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "43d5b281e737544384e969a5ccad3f1cdd24b48086a0fc1b2a5262a26b8f4f4a"
dependencies = [
"anstyle",
"anstyle-parse",
"anstyle-query",
"anstyle-wincon",
"colorchoice",
"is_terminal_polyfill",
"utf8parse",
]
[[package]]
name = "anstyle"
version = "1.0.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5192cca8006f1fd4f7237516f40fa183bb07f8fbdfedaa0036de5ea9b0b45e78"
[[package]]
name = "anstyle-parse"
version = "0.2.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4e7644824f0aa2c7b9384579234ef10eb7efb6a0deb83f9630a49594dd9c15c2"
dependencies = [
"utf8parse",
]
[[package]]
name = "anstyle-query"
version = "1.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "40c48f72fd53cd289104fc64099abca73db4166ad86ea0b4341abe65af83dadc"
dependencies = [
"windows-sys",
]
[[package]]
name = "anstyle-wincon"
version = "3.0.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "291e6a250ff86cd4a820112fb8898808a366d8f9f58ce16d1f538353ad55747d"
dependencies = [
"anstyle",
"once_cell_polyfill",
"windows-sys",
]
[[package]]
name = "anyhow"
version = "1.0.100"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a23eb6b1614318a8071c9b2521f36b424b2c83db5eb3a0fead4a6c0809af6e61"
[[package]]
name = "bitflags"
version = "2.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "812e12b5285cc515a9c72a5c1d3b6d46a19dac5acfef5265968c166106e31dd3"
[[package]]
name = "blazesym"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ace0ab71bbe9a25cb82f6d0e513ae11aebd1a38787664475bb2ed5cbe2329736"
dependencies = [
"cpp_demangle",
"gimli",
"libc",
"memmap2",
"miniz_oxide",
"rustc-demangle",
]
[[package]]
name = "cc"
version = "1.2.46"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b97463e1064cb1b1c1384ad0a0b9c8abd0988e2a91f52606c80ef14aadb63e36"
dependencies = [
"find-msvc-tools",
"shlex",
]
[[package]]
name = "cfg-if"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801"
[[package]]
name = "cfg_aliases"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "613afe47fcd5fac7ccf1db93babcb082c5994d996f20b8b159f2ad1658eb5724"
[[package]]
name = "clap"
version = "4.5.51"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4c26d721170e0295f191a69bd9a1f93efcdb0aff38684b61ab5750468972e5f5"
dependencies = [
"clap_builder",
"clap_derive",
]
[[package]]
name = "clap_builder"
version = "4.5.51"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "75835f0c7bf681bfd05abe44e965760fea999a5286c6eb2d59883634fd02011a"
dependencies = [
"anstream",
"anstyle",
"clap_lex",
"strsim",
]
[[package]]
name = "clap_derive"
version = "4.5.49"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2a0b5487afeab2deb2ff4e03a807ad1a03ac532ff5a2cee5d86884440c7f7671"
dependencies = [
"heck",
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "clap_lex"
version = "0.7.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a1d728cc89cf3aee9ff92b05e62b19ee65a02b5702cff7d5a377e32c6ae29d8d"
[[package]]
name = "colorchoice"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b05b61dc5112cbb17e4b6cd61790d9845d13888356391624cbe7e41efeac1e75"
[[package]]
name = "cpp_demangle"
version = "0.5.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0667304c32ea56cb4cd6d2d7c0cfe9a2f8041229db8c033af7f8d69492429def"
dependencies = [
"cfg-if",
]
[[package]]
name = "equivalent"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "877a4ace8713b0bcf2a4e7eec82529c029f1d0619886d18145fea96c3ffe5c0f"
[[package]]
name = "fallible-iterator"
version = "0.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2acce4a10f12dc2fb14a218589d4f1f62ef011b2d0cc4b3cb1bba8e94da14649"
[[package]]
name = "find-msvc-tools"
version = "0.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3a3076410a55c90011c298b04d0cfa770b00fa04e1e3c97d3f6c9de105a03844"
[[package]]
name = "gimli"
version = "0.32.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e629b9b98ef3dd8afe6ca2bd0f89306cec16d43d907889945bc5d6687f2f13c7"
dependencies = [
"fallible-iterator",
"indexmap",
"stable_deref_trait",
]
[[package]]
name = "hashbrown"
version = "0.16.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5419bdc4f6a9207fbeba6d11b604d481addf78ecd10c11ad51e76c2f6482748d"
[[package]]
name = "heck"
version = "0.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea"
[[package]]
name = "indexmap"
version = "2.12.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6717a8d2a5a929a1a2eb43a12812498ed141a0bcfb7e8f7844fbdbe4303bba9f"
dependencies = [
"equivalent",
"hashbrown",
]
[[package]]
name = "is_terminal_polyfill"
version = "1.70.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a6cb138bb79a146c1bd460005623e142ef0181e3d0219cb493e02f7d08a35695"
[[package]]
name = "libbpf-rs"
version = "0.24.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "93edd9cd673087fa7518fd63ad6c87be2cd9b4e35034b1873f3e3258c018275b"
dependencies = [
"bitflags",
"libbpf-sys",
"libc",
"vsprintf",
]
[[package]]
name = "libbpf-sys"
version = "1.6.2+v1.6.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ba0346fc595fa2c8e274903e8a0e3ed5e6a29183af167567f6289fd3b116881b"
dependencies = [
"cc",
"nix",
"pkg-config",
]
[[package]]
name = "libc"
version = "0.2.177"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2874a2af47a2325c2001a6e6fad9b16a53b802102b528163885171cf92b15976"
[[package]]
name = "memmap2"
version = "0.9.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "744133e4a0e0a658e1374cf3bf8e415c4052a15a111acd372764c55b4177d490"
dependencies = [
"libc",
]
[[package]]
name = "miniz_oxide"
version = "0.8.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1fa76a2c86f704bdb222d66965fb3d63269ce38518b83cb0575fca855ebb6316"
dependencies = [
"adler2",
"simd-adler32",
]
[[package]]
name = "nix"
version = "0.30.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "74523f3a35e05aba87a1d978330aef40f67b0304ac79c1c00b294c9830543db6"
dependencies = [
"bitflags",
"cfg-if",
"cfg_aliases",
"libc",
]
[[package]]
name = "once_cell_polyfill"
version = "1.70.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "384b8ab6d37215f3c5301a95a4accb5d64aa607f1fcb26a11b5303878451b4fe"
[[package]]
name = "pkg-config"
version = "0.3.32"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7edddbd0b52d732b21ad9a5fab5c704c14cd949e5e9a1ec5929a24fded1b904c"
[[package]]
name = "plain"
version = "0.2.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b4596b6d070b27117e987119b4dac604f3c58cfb0b191112e24771b2faeac1a6"
[[package]]
name = "proc-macro2"
version = "1.0.103"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5ee95bc4ef87b8d5ba32e8b7714ccc834865276eab0aed5c9958d00ec45f49e8"
dependencies = [
"unicode-ident",
]
[[package]]
name = "blazesym-example"
version = "0.1.0"
dependencies = [
"anyhow",
"blazesym",
"clap",
"libbpf-rs",
"libc",
"plain",
]
[[package]]
name = "quote"
version = "1.0.42"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a338cc41d27e6cc6dce6cefc13a0729dfbb81c262b1f519331575dd80ef3067f"
dependencies = [
"proc-macro2",
]
[[package]]
name = "rustc-demangle"
version = "0.1.26"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "56f7d92ca342cea22a06f2121d944b4fd82af56988c270852495420f961d4ace"
[[package]]
name = "shlex"
version = "1.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0fda2ff0d084019ba4d7c6f371c95d8fd75ce3524c3cb8fb653a3023f6323e64"
[[package]]
name = "simd-adler32"
version = "0.3.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d66dc143e6b11c1eddc06d5c423cfc97062865baf299914ab64caa38182078fe"
[[package]]
name = "stable_deref_trait"
version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6ce2be8dc25455e1f91df71bfa12ad37d7af1092ae736f3a6cd0e37bc7810596"
[[package]]
name = "strsim"
version = "0.11.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7da8b5736845d9f2fcb837ea5d9e2628564b3b043a70948a3f0b778838c5fb4f"
[[package]]
name = "syn"
version = "2.0.110"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a99801b5bd34ede4cf3fc688c5919368fea4e4814a4664359503e6015b280aea"
dependencies = [
"proc-macro2",
"quote",
"unicode-ident",
]
[[package]]
name = "unicode-ident"
version = "1.0.22"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9312f7c4f6ff9069b165498234ce8be658059c6728633667c526e27dc2cf1df5"
[[package]]
name = "utf8parse"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821"
[[package]]
name = "vsprintf"
version = "2.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "aec2f81b75ca063294776b4f7e8da71d1d5ae81c2b1b149c8d89969230265d63"
dependencies = [
"cc",
"libc",
]
[[package]]
name = "windows-link"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f0805222e57f7521d6a62e36fa9163bc891acd422f971defe97d64e70d0a4fe5"
[[package]]
name = "windows-sys"
version = "0.61.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ae137229bcbd6cdf0f7b80a31df61766145077ddf49416a728b02cb3921ff3fc"
dependencies = [
"windows-link",
]

View File

@ -0,0 +1,14 @@
[package]
name = "blazesym-example"
version = "0.1.0"
edition = "2024"
[dependencies]
libbpf-rs = "0.24"
blazesym = "0.2.0-rc.4"
anyhow = "1.0"
clap = { version = "4.5", features = ["derive"] }
libc = "0.2"
plain = "0.2"
[build-dependencies]

View File

@ -0,0 +1,333 @@
// src/main.rs - Fixed imports and error handling
use std::mem;
use std::path::PathBuf;
use std::time::Duration;
use anyhow::{anyhow, Context, Result};
use blazesym::symbolize::{CodeInfo, Input, Symbolized, Symbolizer};
use blazesym::symbolize::source::{Source, Kernel, Process};
use clap::Parser;
use libbpf_rs::{MapCore, ObjectBuilder, RingBufferBuilder}; // Added MapCore
// Match your Python struct exactly
#[repr(C)]
#[derive(Debug, Copy, Clone)]
struct ExecEvent {
pid: i64,
cpu: i32,
timestamp: i64,
comm: [u8; 16],
kstack_sz: i64,
ustack_sz: i64,
kstack: [u8; 128], // str(128) in Python
ustack: [u8; 128], // str(128) in Python
}
unsafe impl plain::Plain for ExecEvent {}
// Define perf_event constants (not in libc on all platforms)
const PERF_TYPE_HARDWARE: u32 = 0;
const PERF_TYPE_SOFTWARE: u32 = 1;
const PERF_COUNT_HW_CPU_CYCLES: u64 = 0;
const PERF_COUNT_SW_CPU_CLOCK: u64 = 0;
#[repr(C)]
struct PerfEventAttr {
type_: u32,
size: u32,
config: u64,
sample_period_or_freq: u64,
sample_type: u64,
read_format: u64,
flags: u64,
// ... rest can be zeroed
_padding: [u64; 64],
}
#[derive(Parser, Debug)]
struct Args {
/// Path to the BPF object file
#[arg(default_value = "stack_traces.o")]
object_file: PathBuf,
/// Sampling frequency
#[arg(short, long, default_value_t = 50)]
freq: u64,
/// Use software events
#[arg(long)]
sw_event: bool,
/// Verbose output
#[arg(short, long)]
verbose: bool,
}
fn open_perf_event(cpu: i32, freq: u64, sw_event: bool) -> Result<i32> {
let mut attr: PerfEventAttr = unsafe { mem::zeroed() };
attr.size = mem::size_of::<PerfEventAttr>() as u32;
attr.type_ = if sw_event {
PERF_TYPE_SOFTWARE
} else {
PERF_TYPE_HARDWARE
};
attr.config = if sw_event {
PERF_COUNT_SW_CPU_CLOCK
} else {
PERF_COUNT_HW_CPU_CYCLES
};
// Use frequency-based sampling
attr.sample_period_or_freq = freq;
attr.flags = 1 << 10; // freq = 1, disabled = 1
let fd = unsafe {
libc::syscall(
libc::SYS_perf_event_open,
&attr as *const _,
-1, // pid = -1 (all processes)
cpu, // cpu
-1, // group_fd
0, // flags
)
};
if fd < 0 {
Err(anyhow!("Failed to open perf event on CPU {}: {}", cpu,
std::io::Error::last_os_error()))
} else {
Ok(fd as i32)
}
}
fn print_stack_trace(
addrs: &[u64],
symbolizer: &Symbolizer,
pid: u32,
is_kernel: bool,
) {
if addrs.is_empty() {
return;
}
let src = if is_kernel {
Source::Kernel(Kernel::default())
} else {
Source::Process(Process::new(pid.into()))
};
let syms = match symbolizer.symbolize(&src, Input::AbsAddr(addrs)) {
Ok(syms) => syms,
Err(e) => {
eprintln!(" Failed to symbolize: {}", e);
for addr in addrs {
println!("0x{:016x}: <no-symbol>", addr);
}
return;
}
};
for (addr, sym) in addrs.iter().zip(syms.iter()) {
match sym {
Symbolized::Sym(sym_info) => {
print!("0x{:016x}: {} @ 0x{:x}+0x{:x}",
addr, sym_info.name, sym_info.addr, sym_info.offset);
if let Some(ref code_info) = sym_info.code_info {
print_code_info(code_info);
}
println!();
// Print inlined frames
for inlined in &sym_info.inlined {
print!(" {} (inlined)", inlined.name);
if let Some(ref code_info) = inlined.code_info {
print_code_info(code_info);
}
println!();
}
}
Symbolized::Unknown(..) => {
println!("0x{:016x}: <no-symbol>", addr);
}
}
}
}
fn print_code_info(code_info: &CodeInfo) {
let path = code_info.to_path();
let path_str = path.display();
match (code_info.line, code_info.column) {
(Some(line), Some(col)) => print!(" {}:{}:{}", path_str, line, col),
(Some(line), None) => print!(" {}:{}", path_str, line),
(None, _) => print!(" {}", path_str),
}
}
fn handle_event(symbolizer: &Symbolizer, data: &[u8]) -> i32 {
let event = plain::from_bytes::<ExecEvent>(data).expect("Invalid event data");
// Extract comm string
let comm = std::str::from_utf8(&event.comm)
.unwrap_or("<unknown>")
.trim_end_matches('\0');
println!("[{:.9}] COMM: {} (pid={}) @ CPU {}",
event.timestamp as f64 / 1_000_000_000.0,
comm,
event.pid,
event.cpu);
// Handle kernel stack
if event.kstack_sz > 0 {
println!("Kernel:");
let num_frames = (event.kstack_sz / 8) as usize;
let kstack_u64 = unsafe {
std::slice::from_raw_parts(
event.kstack.as_ptr() as *const u64,
num_frames.min(16),
)
};
// Filter out zero addresses
let kstack: Vec<u64> = kstack_u64.iter()
.copied()
.take_while(|&addr| addr != 0)
.collect();
print_stack_trace(&kstack, symbolizer, 0, true);
} else {
println!("No Kernel Stack");
}
// Handle user stack
if event.ustack_sz > 0 {
println!("Userspace:");
let num_frames = (event.ustack_sz / 8) as usize;
let ustack_u64 = unsafe {
std::slice::from_raw_parts(
event.ustack.as_ptr() as *const u64,
num_frames.min(16),
)
};
// Filter out zero addresses
let ustack: Vec<u64> = ustack_u64.iter()
.copied()
.take_while(|&addr| addr != 0)
.collect();
print_stack_trace(&ustack, symbolizer, event.pid as u32, false);
} else {
println!("No Userspace Stack");
}
println!();
0
}
fn main() -> Result<()> {
let args = Args::parse();
if !args.object_file.exists() {
return Err(anyhow!("Object file not found: {:?}", args.object_file));
}
println!("Loading BPF object: {:?}", args.object_file);
// Load BPF object
let mut obj_builder = ObjectBuilder::default();
obj_builder.debug(args.verbose);
let open_obj = obj_builder
.open_file(&args.object_file)
.context("Failed to open BPF object")?;
let mut obj = open_obj.load().context("Failed to load BPF object")?;
println!("✓ BPF object loaded");
// Find the program
let prog = obj
.progs_mut()
.find(|p| p.name() == "trace_exec_enter")
.ok_or_else(|| anyhow!("Program 'trace_exec_enter' not found"))?;
println!("✓ Found program: trace_exec_enter");
// Find the map
let map = obj
.maps()
.find(|m| m.name() == "exec_events")
.ok_or_else(|| anyhow!("Map 'exec_events' not found"))?;
println!("✓ Found map: exec_events");
// Get number of CPUs
let num_cpus = libbpf_rs::num_possible_cpus()?;
println!("✓ Detected {} CPUs\n", num_cpus);
// Open perf events and attach BPF program
println!("Setting up perf events...");
let mut links = Vec::new();
for cpu in 0..num_cpus {
match open_perf_event(cpu as i32, args.freq, args.sw_event) {
Ok(perf_fd) => {
match prog.attach_perf_event(perf_fd) {
Ok(link) => {
links.push(link);
if args.verbose {
println!(" ✓ Attached to CPU {}", cpu);
}
}
Err(e) => {
eprintln!(" ✗ Failed to attach to CPU {}: {}", cpu, e);
unsafe { libc::close(perf_fd); }
}
}
}
Err(e) => {
if args.verbose {
eprintln!(" ✗ Failed to open perf event on CPU {}: {}", cpu, e);
}
}
}
}
println!("✓ Attached to {} CPUs\n", links.len());
if links.is_empty() {
return Err(anyhow!("Failed to attach to any CPU"));
}
// Initialize symbolizer
let symbolizer = Symbolizer::new();
// Set up ring buffer
let mut builder = RingBufferBuilder::new();
builder.add(&map, move |data: &[u8]| -> i32 {
handle_event(&symbolizer, data)
})?;
let ringbuf = builder.build()?;
println!("========================================");
println!("Profiling started. Press Ctrl+C to stop.");
println!("========================================\n");
// Poll for events - just keep polling until error
loop {
if let Err(e) = ringbuf.poll(Duration::from_millis(100)) {
// Any error breaks the loop (including Ctrl+C)
eprintln!("\nStopping: {}", e);
break;
}
}
println!("Done.");
Ok(())
}

View File

@ -0,0 +1,49 @@
# tests/passing_tests/ringbuf_advanced.py
from pythonbpf import bpf, map, section, bpfglobal, struct, compile
from pythonbpf.maps import RingBuffer
from pythonbpf.helper import ktime, pid, smp_processor_id, comm, get_stack
from ctypes import c_void_p, c_int32, c_int64
import logging
@bpf
@struct
class exec_event:
pid: c_int64
cpu: c_int32
timestamp: c_int64
comm: str(16) # type: ignore [valid-type]
kstack_sz: c_int64
ustack_sz: c_int64
kstack: str(128) # type: ignore [valid-type]
ustack: str(128) # type: ignore [valid-type]
@bpf
@map
def exec_events() -> RingBuffer:
return RingBuffer(max_entries=1048576)
@bpf
@section("perf_event")
def trace_exec_enter(ctx: c_void_p) -> c_int64:
evt = exec_event()
evt.pid = pid()
evt.cpu = smp_processor_id()
evt.timestamp = ktime()
comm(evt.comm)
evt.kstack_sz = get_stack(evt.kstack)
evt.ustack_sz = get_stack(evt.ustack, 256)
exec_events.output(evt)
print(f"Submitted exec_event for pid: {evt.pid}, cpu: {evt.cpu}")
return 0 # type: ignore [return-value]
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile(logging.INFO)

View File

@ -0,0 +1,22 @@
"""
Process Anomaly Detection - Constants and Utilities
"""
import logging
logger = logging.getLogger(__name__)
MAX_SYSCALLS = 548
def comm_for_pid(pid: int) -> bytes | None:
"""Get process name from /proc."""
try:
with open(f"/proc/{pid}/comm", "rb") as f:
return f.read().strip()
except FileNotFoundError:
logger.warning(f"Process with PID {pid} not found.")
except PermissionError:
logger.warning(f"Permission denied when accessing /proc/{pid}/comm.")
except Exception as e:
logger.warning(f"Error reading /proc/{pid}/comm: {e}")
return None

View File

@ -0,0 +1,173 @@
"""
Autoencoder for Process Behavior Anomaly Detection
Uses Keras/TensorFlow to train an autoencoder on syscall patterns.
Anomalies are detected when reconstruction error exceeds threshold.
"""
import logging
import os
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow import keras
from lib import MAX_SYSCALLS
logger = logging.getLogger(__name__)
def create_autoencoder(n_inputs: int = MAX_SYSCALLS) -> keras.Model:
"""
Create the autoencoder architecture.
Architecture: input → encoder → bottleneck → decoder → output
"""
inp = keras.Input(shape=(n_inputs,))
# Encoder
encoder = keras.layers.Dense(n_inputs)(inp)
encoder = keras.layers.ReLU()(encoder)
# Bottleneck (compressed representation)
bottleneck = keras.layers.Dense(n_inputs // 2)(encoder)
# Decoder
decoder = keras.layers.Dense(n_inputs)(bottleneck)
decoder = keras.layers.ReLU()(decoder)
output = keras.layers.Dense(n_inputs, activation="linear")(decoder)
model = keras.Model(inp, output)
model.compile(optimizer="adam", loss="mse")
return model
class AutoEncoder:
"""
Autoencoder for syscall pattern anomaly detection.
Usage:
# Training
ae = AutoEncoder('model.keras')
model, threshold = ae.train('data.csv', epochs=200)
# Inference
ae = AutoEncoder('model.keras', load=True)
_, errors, total_error = ae.predict([features])
"""
def __init__(self, filename: str, load: bool = False):
self.filename = filename
self.model = None
if load:
self._load_model()
def _load_model(self) -> None:
"""Load a trained model from disk."""
if not os.path.exists(self.filename):
raise FileNotFoundError(f"Model file not found: {self.filename}")
logger.info(f"Loading model from {self.filename}")
self.model = keras.models.load_model(self.filename)
def train(
self,
datafile: str,
epochs: int,
batch_size: int,
test_size: float = 0.1,
) -> tuple[keras.Model, float]:
"""
Train the autoencoder on collected data.
Args:
datafile: Path to CSV file with training data
epochs: Number of training epochs
batch_size: Training batch size
test_size: Fraction of data to use for validation
Returns:
Tuple of (trained model, error threshold)
"""
if not os.path.exists(datafile):
raise FileNotFoundError(f"Data file not found: {datafile}")
logger.info(f"Loading training data from {datafile}")
# Load and prepare data
df = pd.read_csv(datafile)
features = df.drop(["sample_time"], axis=1).values
logger.info(f"Loaded {len(features)} samples with {features.shape[1]} features")
# Split train/test
train_data, test_data = train_test_split(
features,
test_size=test_size,
random_state=42,
)
logger.info(f"Training set: {len(train_data)} samples")
logger.info(f"Test set: {len(test_data)} samples")
# Create and train model
self.model = create_autoencoder()
if self.model is None:
raise RuntimeError("Failed to create the autoencoder model.")
logger.info("Training autoencoder...")
self.model.fit(
train_data,
train_data,
validation_data=(test_data, test_data),
epochs=epochs,
batch_size=batch_size,
verbose=1,
)
# Save model (use .keras format for Keras 3.x compatibility)
self.model.save(self.filename)
logger.info(f"Model saved to {self.filename}")
# Calculate error threshold from test data
threshold = self._calculate_threshold(test_data)
return self.model, threshold
def _calculate_threshold(self, test_data: np.ndarray) -> float:
"""Calculate error threshold from test data."""
logger.info(f"Calculating error threshold from {len(test_data)} test samples")
if self.model is None:
raise RuntimeError("Model not loaded. Use load=True or train first.")
predictions = self.model.predict(test_data, verbose=0)
errors = np.abs(test_data - predictions).sum(axis=1)
return float(errors.max())
def predict(self, X: list | np.ndarray) -> tuple[np.ndarray, np.ndarray, float]:
"""
Run prediction and return reconstruction error.
Args:
X: Input data (list of feature vectors)
Returns:
Tuple of (reconstructed, per_feature_errors, total_error)
"""
if self.model is None:
raise RuntimeError("Model not loaded. Use load=True or train first.")
X = np.asarray(X, dtype=np.float32)
y = self.model.predict(X, verbose=0)
# Per-feature reconstruction error
errors = np.abs(X[0] - y[0])
total_error = float(errors.sum())
return y, errors, total_error

View File

@ -0,0 +1,448 @@
# Copyright 2017 Sasha Goldshtein
# Copyright 2018 Red Hat, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
syscall.py contains functions useful for mapping between syscall names and numbers
"""
# Syscall table for Linux x86_64, not very recent. Automatically generated from
# https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/x86/entry/syscalls/syscall_64.tbl?h=linux-6.17.y
# using the following command:
#
# cat arch/x86/entry/syscalls/syscall_64.tbl \
# | awk 'BEGIN { print "syscalls = {" }
# /^[0-9]/ { print " "$1": b\""$3"\"," }
# END { print "}" }'
SYSCALLS = {
0: b"read",
1: b"write",
2: b"open",
3: b"close",
4: b"stat",
5: b"fstat",
6: b"lstat",
7: b"poll",
8: b"lseek",
9: b"mmap",
10: b"mprotect",
11: b"munmap",
12: b"brk",
13: b"rt_sigaction",
14: b"rt_sigprocmask",
15: b"rt_sigreturn",
16: b"ioctl",
17: b"pread64",
18: b"pwrite64",
19: b"readv",
20: b"writev",
21: b"access",
22: b"pipe",
23: b"select",
24: b"sched_yield",
25: b"mremap",
26: b"msync",
27: b"mincore",
28: b"madvise",
29: b"shmget",
30: b"shmat",
31: b"shmctl",
32: b"dup",
33: b"dup2",
34: b"pause",
35: b"nanosleep",
36: b"getitimer",
37: b"alarm",
38: b"setitimer",
39: b"getpid",
40: b"sendfile",
41: b"socket",
42: b"connect",
43: b"accept",
44: b"sendto",
45: b"recvfrom",
46: b"sendmsg",
47: b"recvmsg",
48: b"shutdown",
49: b"bind",
50: b"listen",
51: b"getsockname",
52: b"getpeername",
53: b"socketpair",
54: b"setsockopt",
55: b"getsockopt",
56: b"clone",
57: b"fork",
58: b"vfork",
59: b"execve",
60: b"exit",
61: b"wait4",
62: b"kill",
63: b"uname",
64: b"semget",
65: b"semop",
66: b"semctl",
67: b"shmdt",
68: b"msgget",
69: b"msgsnd",
70: b"msgrcv",
71: b"msgctl",
72: b"fcntl",
73: b"flock",
74: b"fsync",
75: b"fdatasync",
76: b"truncate",
77: b"ftruncate",
78: b"getdents",
79: b"getcwd",
80: b"chdir",
81: b"fchdir",
82: b"rename",
83: b"mkdir",
84: b"rmdir",
85: b"creat",
86: b"link",
87: b"unlink",
88: b"symlink",
89: b"readlink",
90: b"chmod",
91: b"fchmod",
92: b"chown",
93: b"fchown",
94: b"lchown",
95: b"umask",
96: b"gettimeofday",
97: b"getrlimit",
98: b"getrusage",
99: b"sysinfo",
100: b"times",
101: b"ptrace",
102: b"getuid",
103: b"syslog",
104: b"getgid",
105: b"setuid",
106: b"setgid",
107: b"geteuid",
108: b"getegid",
109: b"setpgid",
110: b"getppid",
111: b"getpgrp",
112: b"setsid",
113: b"setreuid",
114: b"setregid",
115: b"getgroups",
116: b"setgroups",
117: b"setresuid",
118: b"getresuid",
119: b"setresgid",
120: b"getresgid",
121: b"getpgid",
122: b"setfsuid",
123: b"setfsgid",
124: b"getsid",
125: b"capget",
126: b"capset",
127: b"rt_sigpending",
128: b"rt_sigtimedwait",
129: b"rt_sigqueueinfo",
130: b"rt_sigsuspend",
131: b"sigaltstack",
132: b"utime",
133: b"mknod",
134: b"uselib",
135: b"personality",
136: b"ustat",
137: b"statfs",
138: b"fstatfs",
139: b"sysfs",
140: b"getpriority",
141: b"setpriority",
142: b"sched_setparam",
143: b"sched_getparam",
144: b"sched_setscheduler",
145: b"sched_getscheduler",
146: b"sched_get_priority_max",
147: b"sched_get_priority_min",
148: b"sched_rr_get_interval",
149: b"mlock",
150: b"munlock",
151: b"mlockall",
152: b"munlockall",
153: b"vhangup",
154: b"modify_ldt",
155: b"pivot_root",
156: b"_sysctl",
157: b"prctl",
158: b"arch_prctl",
159: b"adjtimex",
160: b"setrlimit",
161: b"chroot",
162: b"sync",
163: b"acct",
164: b"settimeofday",
165: b"mount",
166: b"umount2",
167: b"swapon",
168: b"swapoff",
169: b"reboot",
170: b"sethostname",
171: b"setdomainname",
172: b"iopl",
173: b"ioperm",
174: b"create_module",
175: b"init_module",
176: b"delete_module",
177: b"get_kernel_syms",
178: b"query_module",
179: b"quotactl",
180: b"nfsservctl",
181: b"getpmsg",
182: b"putpmsg",
183: b"afs_syscall",
184: b"tuxcall",
185: b"security",
186: b"gettid",
187: b"readahead",
188: b"setxattr",
189: b"lsetxattr",
190: b"fsetxattr",
191: b"getxattr",
192: b"lgetxattr",
193: b"fgetxattr",
194: b"listxattr",
195: b"llistxattr",
196: b"flistxattr",
197: b"removexattr",
198: b"lremovexattr",
199: b"fremovexattr",
200: b"tkill",
201: b"time",
202: b"futex",
203: b"sched_setaffinity",
204: b"sched_getaffinity",
205: b"set_thread_area",
206: b"io_setup",
207: b"io_destroy",
208: b"io_getevents",
209: b"io_submit",
210: b"io_cancel",
211: b"get_thread_area",
212: b"lookup_dcookie",
213: b"epoll_create",
214: b"epoll_ctl_old",
215: b"epoll_wait_old",
216: b"remap_file_pages",
217: b"getdents64",
218: b"set_tid_address",
219: b"restart_syscall",
220: b"semtimedop",
221: b"fadvise64",
222: b"timer_create",
223: b"timer_settime",
224: b"timer_gettime",
225: b"timer_getoverrun",
226: b"timer_delete",
227: b"clock_settime",
228: b"clock_gettime",
229: b"clock_getres",
230: b"clock_nanosleep",
231: b"exit_group",
232: b"epoll_wait",
233: b"epoll_ctl",
234: b"tgkill",
235: b"utimes",
236: b"vserver",
237: b"mbind",
238: b"set_mempolicy",
239: b"get_mempolicy",
240: b"mq_open",
241: b"mq_unlink",
242: b"mq_timedsend",
243: b"mq_timedreceive",
244: b"mq_notify",
245: b"mq_getsetattr",
246: b"kexec_load",
247: b"waitid",
248: b"add_key",
249: b"request_key",
250: b"keyctl",
251: b"ioprio_set",
252: b"ioprio_get",
253: b"inotify_init",
254: b"inotify_add_watch",
255: b"inotify_rm_watch",
256: b"migrate_pages",
257: b"openat",
258: b"mkdirat",
259: b"mknodat",
260: b"fchownat",
261: b"futimesat",
262: b"newfstatat",
263: b"unlinkat",
264: b"renameat",
265: b"linkat",
266: b"symlinkat",
267: b"readlinkat",
268: b"fchmodat",
269: b"faccessat",
270: b"pselect6",
271: b"ppoll",
272: b"unshare",
273: b"set_robust_list",
274: b"get_robust_list",
275: b"splice",
276: b"tee",
277: b"sync_file_range",
278: b"vmsplice",
279: b"move_pages",
280: b"utimensat",
281: b"epoll_pwait",
282: b"signalfd",
283: b"timerfd_create",
284: b"eventfd",
285: b"fallocate",
286: b"timerfd_settime",
287: b"timerfd_gettime",
288: b"accept4",
289: b"signalfd4",
290: b"eventfd2",
291: b"epoll_create1",
292: b"dup3",
293: b"pipe2",
294: b"inotify_init1",
295: b"preadv",
296: b"pwritev",
297: b"rt_tgsigqueueinfo",
298: b"perf_event_open",
299: b"recvmmsg",
300: b"fanotify_init",
301: b"fanotify_mark",
302: b"prlimit64",
303: b"name_to_handle_at",
304: b"open_by_handle_at",
305: b"clock_adjtime",
306: b"syncfs",
307: b"sendmmsg",
308: b"setns",
309: b"getcpu",
310: b"process_vm_readv",
311: b"process_vm_writev",
312: b"kcmp",
313: b"finit_module",
314: b"sched_setattr",
315: b"sched_getattr",
316: b"renameat2",
317: b"seccomp",
318: b"getrandom",
319: b"memfd_create",
320: b"kexec_file_load",
321: b"bpf",
322: b"execveat",
323: b"userfaultfd",
324: b"membarrier",
325: b"mlock2",
326: b"copy_file_range",
327: b"preadv2",
328: b"pwritev2",
329: b"pkey_mprotect",
330: b"pkey_alloc",
331: b"pkey_free",
332: b"statx",
333: b"io_pgetevents",
334: b"rseq",
335: b"uretprobe",
424: b"pidfd_send_signal",
425: b"io_uring_setup",
426: b"io_uring_enter",
427: b"io_uring_register",
428: b"open_tree",
429: b"move_mount",
430: b"fsopen",
431: b"fsconfig",
432: b"fsmount",
433: b"fspick",
434: b"pidfd_open",
435: b"clone3",
436: b"close_range",
437: b"openat2",
438: b"pidfd_getfd",
439: b"faccessat2",
440: b"process_madvise",
441: b"epoll_pwait2",
442: b"mount_setattr",
443: b"quotactl_fd",
444: b"landlock_create_ruleset",
445: b"landlock_add_rule",
446: b"landlock_restrict_self",
447: b"memfd_secret",
448: b"process_mrelease",
449: b"futex_waitv",
450: b"set_mempolicy_home_node",
451: b"cachestat",
452: b"fchmodat2",
453: b"map_shadow_stack",
454: b"futex_wake",
455: b"futex_wait",
456: b"futex_requeue",
457: b"statmount",
458: b"listmount",
459: b"lsm_get_self_attr",
460: b"lsm_set_self_attr",
461: b"lsm_list_modules",
462: b"mseal",
463: b"setxattrat",
464: b"getxattrat",
465: b"listxattrat",
466: b"removexattrat",
467: b"open_tree_attr",
468: b"file_getattr",
469: b"file_setattr",
512: b"rt_sigaction",
513: b"rt_sigreturn",
514: b"ioctl",
515: b"readv",
516: b"writev",
517: b"recvfrom",
518: b"sendmsg",
519: b"recvmsg",
520: b"execve",
521: b"ptrace",
522: b"rt_sigpending",
523: b"rt_sigtimedwait",
524: b"rt_sigqueueinfo",
525: b"sigaltstack",
526: b"timer_create",
527: b"mq_notify",
528: b"kexec_load",
529: b"waitid",
530: b"set_robust_list",
531: b"get_robust_list",
532: b"vmsplice",
533: b"move_pages",
534: b"preadv",
535: b"pwritev",
536: b"rt_tgsigqueueinfo",
537: b"recvmmsg",
538: b"sendmmsg",
539: b"process_vm_readv",
540: b"process_vm_writev",
541: b"setsockopt",
542: b"getsockopt",
543: b"io_setup",
544: b"io_submit",
545: b"execveat",
546: b"preadv2",
547: b"pwritev2",
}

View File

@ -0,0 +1,117 @@
"""
PythonBPF eBPF Probe for Syscall Histogram Collection
"""
from vmlinux import struct_trace_event_raw_sys_enter
from pythonbpf import bpf, map, section, bpfglobal, BPF
from pythonbpf.helper import pid
from pythonbpf.maps import HashMap
from ctypes import c_int64
from lib import MAX_SYSCALLS, comm_for_pid
@bpf
@map
def histogram() -> HashMap:
return HashMap(key=c_int64, value=c_int64, max_entries=1024)
@bpf
@map
def target_pid_map() -> HashMap:
return HashMap(key=c_int64, value=c_int64, max_entries=1)
@bpf
@section("tracepoint/raw_syscalls/sys_enter")
def trace_syscall(ctx: struct_trace_event_raw_sys_enter) -> c_int64:
syscall_id = ctx.id
current_pid = pid()
target = target_pid_map.lookup(0)
if target:
if current_pid != target:
return 0 # type: ignore
if syscall_id < 0 or syscall_id >= 548:
return 0 # type: ignore
count = histogram.lookup(syscall_id)
if count:
histogram.update(syscall_id, count + 1)
else:
histogram.update(syscall_id, 1)
return 0 # type: ignore
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
ebpf_prog = BPF()
class Probe:
"""
Syscall histogram probe for a target process.
Usage:
probe = Probe(target_pid=1234)
probe.start()
histogram = probe.get_histogram()
"""
def __init__(self, target_pid: int, max_syscalls: int = MAX_SYSCALLS):
self.target_pid = target_pid
self.max_syscalls = max_syscalls
self.comm = comm_for_pid(target_pid)
if self.comm is None:
raise ValueError(f"Cannot find process with PID {target_pid}")
self._bpf = None
self._histogram_map = None
self._target_map = None
def start(self):
"""Compile, load, and attach the BPF probe."""
# Compile and load
self._bpf = ebpf_prog
self._bpf.load()
self._bpf.attach_all()
# Get map references
self._histogram_map = self._bpf["histogram"]
self._target_map = self._bpf["target_pid_map"]
# Set target PID in the map
self._target_map.update(0, self.target_pid)
return self
def get_histogram(self) -> list:
"""Read current histogram values as a list."""
if self._histogram_map is None:
raise RuntimeError("Probe not started. Call start() first.")
result = [0] * self.max_syscalls
for syscall_id in range(self.max_syscalls):
try:
count = self._histogram_map.lookup(syscall_id)
if count is not None:
result[syscall_id] = int(count)
except Exception:
pass
return result
def __getitem__(self, syscall_id: int) -> int:
"""Allow indexing: probe[syscall_id]"""
if self._histogram_map is None:
raise RuntimeError("Probe not started")
try:
count = self._histogram_map.lookup(syscall_id)
return int(count) if count is not None else 0
except Exception:
return 0

View File

@ -0,0 +1,335 @@
#!/usr/bin/env python3
"""
Process Behavior Anomaly Detection using PythonBPF and Autoencoders
Ported from evilsocket's BCC implementation to PythonBPF.
https://github.com/evilsocket/ebpf-process-anomaly-detection
Usage:
# 1.Learn normal behavior from a process
sudo python main.py --learn --pid 1234 --data normal.csv
# 2.Train the autoencoder (no sudo needed)
python main.py --train --data normal.csv --model model.h5
# 3.Monitor for anomalies
sudo python main.py --run --pid 1234 --model model.h5
"""
import argparse
import logging
import os
import sys
import time
from collections import Counter
from lib import MAX_SYSCALLS
from lib.ml import AutoEncoder
from lib.platform import SYSCALLS
from lib.probe import Probe
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
)
logger = logging.getLogger(__name__)
def learn(pid: int, data_path: str, poll_interval_ms: int) -> None:
"""
Capture syscall patterns from target process.
Args:
pid: Target process ID
data_path: Path to save CSV data
poll_interval_ms: Polling interval in milliseconds
"""
if os.path.exists(data_path):
logger.error(
f"{data_path} already exists.Delete it or use a different filename."
)
sys.exit(1)
try:
probe = Probe(pid)
except ValueError as e:
logger.error(str(e))
sys.exit(1)
probe_comm = probe.comm.decode() if probe.comm else "unknown"
print(f"📊 Learning from process {pid} ({probe_comm})")
print(f"📁 Saving data to {data_path}")
print(f"⏱️ Polling interval: {poll_interval_ms}ms")
print("Press Ctrl+C to stop...\n")
probe.start()
prev_histogram = [0.0] * MAX_SYSCALLS
prev_report_time = time.time()
sample_count = 0
poll_interval_sec = poll_interval_ms / 1000.0
header = "sample_time," + ",".join(f"sys_{i}" for i in range(MAX_SYSCALLS))
with open(data_path, "w") as fp:
fp.write(header + "\n")
try:
while True:
histogram = [float(x) for x in probe.get_histogram()]
if histogram != prev_histogram:
deltas = _compute_deltas(prev_histogram, histogram)
prev_histogram = histogram.copy()
row = f"{time.time()},{','.join(map(str, deltas))}"
fp.write(row + "\n")
fp.flush()
sample_count += 1
now = time.time()
if now - prev_report_time >= 1.0:
print(f" {sample_count} samples saved...")
prev_report_time = now
time.sleep(poll_interval_sec)
except KeyboardInterrupt:
print(f"\n✅ Stopped. Saved {sample_count} samples to {data_path}")
def train(data_path: str, model_path: str, epochs: int, batch_size: int) -> None:
"""
Train autoencoder on captured data.
Args:
data_path: Path to training CSV data
model_path: Path to save trained model
epochs: Number of training epochs
batch_size: Training batch size
"""
if not os.path.exists(data_path):
logger.error(f"Data file {data_path} not found.Run --learn first.")
sys.exit(1)
print(f"🧠 Training autoencoder on {data_path}")
print(f" Epochs: {epochs}")
print(f" Batch size: {batch_size}")
print()
ae = AutoEncoder(model_path)
_, threshold = ae.train(data_path, epochs, batch_size)
print()
print("=" * 50)
print("✅ Training complete!")
print(f" Model saved to: {model_path}")
print(f" Error threshold: {threshold:.6f}")
print()
print(f"💡 Use --max-error {threshold:.4f} when running detection")
print("=" * 50)
def run(pid: int, model_path: str, max_error: float, poll_interval_ms: int) -> None:
"""
Monitor process and detect anomalies.
Args:
pid: Target process ID
model_path: Path to trained model
max_error: Anomaly detection threshold
poll_interval_ms: Polling interval in milliseconds
"""
if not os.path.exists(model_path):
logger.error(f"Model file {model_path} not found. Run --train first.")
sys.exit(1)
try:
probe = Probe(pid)
except ValueError as e:
logger.error(str(e))
sys.exit(1)
ae = AutoEncoder(model_path, load=True)
probe_comm = probe.comm.decode() if probe.comm else "unknown"
print(f"🔍 Monitoring process {pid} ({probe_comm}) for anomalies")
print(f" Error threshold: {max_error}")
print(f" Polling interval: {poll_interval_ms}ms")
print("Press Ctrl+C to stop...\n")
probe.start()
prev_histogram = [0.0] * MAX_SYSCALLS
anomaly_count = 0
check_count = 0
poll_interval_sec = poll_interval_ms / 1000.0
try:
while True:
histogram = [float(x) for x in probe.get_histogram()]
if histogram != prev_histogram:
deltas = _compute_deltas(prev_histogram, histogram)
prev_histogram = histogram.copy()
check_count += 1
_, feat_errors, total_error = ae.predict([deltas])
if total_error > max_error:
anomaly_count += 1
_report_anomaly(anomaly_count, total_error, max_error, feat_errors)
time.sleep(poll_interval_sec)
except KeyboardInterrupt:
print("\n✅ Stopped.")
print(f" Checks performed: {check_count}")
print(f" Anomalies detected: {anomaly_count}")
def _compute_deltas(prev: list[float], current: list[float]) -> list[float]:
"""Compute rate of change between two histograms."""
deltas = []
for p, c in zip(prev, current):
if c != 0.0:
delta = 1.0 - (p / c)
else:
delta = 0.0
deltas.append(delta)
return deltas
def _report_anomaly(
count: int,
total_error: float,
threshold: float,
feat_errors: list[float],
) -> None:
"""Print anomaly report with top offending syscalls."""
print(f"🚨 ANOMALY #{count} detected!")
print(f" Total error: {total_error:.4f} (threshold: {threshold})")
errors_by_syscall = {idx: err for idx, err in enumerate(feat_errors)}
top3 = Counter(errors_by_syscall).most_common(3)
print(" Top anomalous syscalls:")
for idx, err in top3:
name = SYSCALLS.get(idx, f"syscall_{idx}")
print(f"{name!r}: {err:.4f}")
print()
def parse_args() -> argparse.Namespace:
"""Parse command line arguments."""
parser = argparse.ArgumentParser(
description="Process anomaly detection with PythonBPF and Autoencoders",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# Learn from a process (e.g., Firefox) for a few minutes
sudo python main.py --learn --pid $(pgrep -o firefox) --data firefox.csv
# Train the model (no sudo needed)
python main.py --train --data firefox.csv --model firefox.h5
# Monitor the same process for anomalies
sudo python main.py --run --pid $(pgrep -o firefox) --model firefox.h5
# Full workflow for nginx:
sudo python main.py --learn --pid $(pgrep -o nginx) --data nginx_normal.csv
python main.py --train --data nginx_normal.csv --model nginx.h5 --epochs 100
sudo python main.py --run --pid $(pgrep -o nginx) --model nginx.h5 --max-error 0.05
""",
)
actions = parser.add_mutually_exclusive_group()
actions.add_argument(
"--learn",
action="store_true",
help="Capture syscall patterns from a process",
)
actions.add_argument(
"--train",
action="store_true",
help="Train autoencoder on captured data",
)
actions.add_argument(
"--run",
action="store_true",
help="Monitor process for anomalies",
)
parser.add_argument(
"--pid",
type=int,
default=0,
help="Target process ID",
)
parser.add_argument(
"--data",
default="data.csv",
help="CSV file for training data (default: data.csv)",
)
parser.add_argument(
"--model",
default="model.keras",
help="Model file path (default: model.h5)",
)
parser.add_argument(
"--time",
type=int,
default=100,
help="Polling interval in milliseconds (default: 100)",
)
parser.add_argument(
"--epochs",
type=int,
default=200,
help="Training epochs (default: 200)",
)
parser.add_argument(
"--batch-size",
type=int,
default=16,
help="Training batch size (default: 16)",
)
parser.add_argument(
"--max-error",
type=float,
default=0.09,
help="Anomaly detection threshold (default: 0.09)",
)
return parser.parse_args()
def main() -> None:
"""Main entry point."""
args = parse_args()
if not any([args.learn, args.train, args.run]):
print("No action specified.Use --learn, --train, or --run.")
print("Run with --help for usage information.")
sys.exit(0)
if args.learn:
if args.pid == 0:
logger.error("--pid required for --learn")
sys.exit(1)
learn(args.pid, args.data, args.time)
elif args.train:
train(args.data, args.model, args.epochs, args.batch_size)
elif args.run:
if args.pid == 0:
logger.error("--pid required for --run")
sys.exit(1)
run(args.pid, args.model, args.max_error, args.time)
if __name__ == "__main__":
main()

View File

@ -21,17 +21,17 @@ def last() -> HashMap:
@section("tracepoint/syscalls/sys_enter_execve")
def do_trace(ctx: c_void_p) -> c_int64:
key = 0
tsp = last().lookup(key)
tsp = last.lookup(key)
if tsp:
kt = ktime()
delta = kt - tsp
if delta < 1000000000:
time_ms = delta // 1000000
print(f"Execve syscall entered within last second, last {time_ms} ms ago")
last().delete(key)
last.delete(key)
else:
kt = ktime()
last().update(key, kt)
last.update(key, kt)
return c_int64(0)

File diff suppressed because one or more lines are too long

View File

@ -3,7 +3,6 @@ import time
from pythonbpf import bpf, map, section, bpfglobal, BPF
from pythonbpf.helper import pid
from pythonbpf.maps import HashMap
from pylibbpf import BpfMap
from ctypes import c_void_p, c_int64, c_uint64, c_int32
import matplotlib.pyplot as plt
@ -26,14 +25,14 @@ def hist() -> HashMap:
def hello(ctx: c_void_p) -> c_int64:
process_id = pid()
one = 1
prev = hist().lookup(process_id)
prev = hist.lookup(process_id)
if prev:
previous_value = prev + 1
print(f"count: {previous_value} with {process_id}")
hist().update(process_id, previous_value)
hist.update(process_id, previous_value)
return c_int64(0)
else:
hist().update(process_id, one)
hist.update(process_id, one)
return c_int64(0)
@ -44,12 +43,12 @@ def LICENSE() -> str:
b = BPF()
b.load_and_attach()
hist = BpfMap(b, hist)
b.load()
b.attach_all()
print("Recording")
time.sleep(10)
counts = list(hist.values())
counts = list(b["hist"].values())
plt.hist(counts, bins=20)
plt.xlabel("Clone calls per PID")

View File

@ -0,0 +1,49 @@
# Container Monitor TUI
A beautiful terminal-based container monitoring tool that combines syscall tracking, file I/O monitoring, and network traffic analysis using eBPF.
## Features
- 🎯 **Interactive Cgroup Selection** - Navigate and select cgroups with arrow keys
- 📊 **Real-time Monitoring** - Live graphs and statistics
- 🔥 **Syscall Tracking** - Total syscall count per cgroup
- 💾 **File I/O Monitoring** - Read/write operations and bytes with graphs
- 🌐 **Network Traffic** - RX/TX packets and bytes with live graphs
-**Efficient Caching** - Reduced /proc lookups for better performance
- 🎨 **Beautiful TUI** - Clean, colorful terminal interface
## Requirements
- Python 3.7+
- pythonbpf
- Root privileges (for eBPF)
## Installation
```bash
# Ensure you have pythonbpf installed
pip install pythonbpf
# Run the monitor
sudo $(which python) container_monitor.py
```
## Usage
1. **Selection Screen**: Use ↑↓ arrow keys to navigate through cgroups, press ENTER to select
2. **Monitoring Screen**: View real-time graphs and statistics, press ESC or 'b' to go back
3. **Exit**: Press 'q' at any time to quit
## Architecture
- `container_monitor.py` - Main BPF program combining all three tracers
- `data_collector.py` - Data collection, caching, and history management
- `tui. py` - Terminal user interface with selection and monitoring screens
## BPF Programs
- **vfs_read/vfs_write** - Track file I/O operations
- **__netif_receive_skb/__dev_queue_xmit** - Track network traffic
- **raw_syscalls/sys_enter** - Count all syscalls
All programs filter by cgroup ID for per-container monitoring.

View File

@ -0,0 +1,220 @@
"""Container Monitor - TUI-based cgroup monitoring combining syscall, file I/O, and network tracking."""
from pythonbpf import bpf, map, section, bpfglobal, struct, BPF
from pythonbpf.maps import HashMap
from pythonbpf.helper import get_current_cgroup_id
from ctypes import c_int32, c_uint64, c_void_p
from vmlinux import struct_pt_regs, struct_sk_buff
from data_collection import ContainerDataCollector
from tui import ContainerMonitorTUI
# ==================== BPF Structs ====================
@bpf
@struct
class read_stats:
bytes: c_uint64
ops: c_uint64
@bpf
@struct
class write_stats:
bytes: c_uint64
ops: c_uint64
@bpf
@struct
class net_stats:
rx_packets: c_uint64
tx_packets: c_uint64
rx_bytes: c_uint64
tx_bytes: c_uint64
# ==================== BPF Maps ====================
@bpf
@map
def read_map() -> HashMap:
return HashMap(key=c_uint64, value=read_stats, max_entries=1024)
@bpf
@map
def write_map() -> HashMap:
return HashMap(key=c_uint64, value=write_stats, max_entries=1024)
@bpf
@map
def net_stats_map() -> HashMap:
return HashMap(key=c_uint64, value=net_stats, max_entries=1024)
@bpf
@map
def syscall_count() -> HashMap:
return HashMap(key=c_uint64, value=c_uint64, max_entries=1024)
# ==================== File I/O Tracing ====================
@bpf
@section("kprobe/vfs_read")
def trace_read(ctx: struct_pt_regs) -> c_int32:
cg = get_current_cgroup_id()
count = c_uint64(ctx.dx)
ptr = read_map.lookup(cg)
if ptr:
s = read_stats()
s.bytes = ptr.bytes + count
s.ops = ptr.ops + 1
read_map.update(cg, s)
else:
s = read_stats()
s.bytes = count
s.ops = c_uint64(1)
read_map.update(cg, s)
return c_int32(0)
@bpf
@section("kprobe/vfs_write")
def trace_write(ctx1: struct_pt_regs) -> c_int32:
cg = get_current_cgroup_id()
count = c_uint64(ctx1.dx)
ptr = write_map.lookup(cg)
if ptr:
s = write_stats()
s.bytes = ptr.bytes + count
s.ops = ptr.ops + 1
write_map.update(cg, s)
else:
s = write_stats()
s.bytes = count
s.ops = c_uint64(1)
write_map.update(cg, s)
return c_int32(0)
# ==================== Network I/O Tracing ====================
@bpf
@section("kprobe/__netif_receive_skb")
def trace_netif_rx(ctx2: struct_pt_regs) -> c_int32:
cgroup_id = get_current_cgroup_id()
skb = struct_sk_buff(ctx2.di)
pkt_len = c_uint64(skb.len)
stats_ptr = net_stats_map.lookup(cgroup_id)
if stats_ptr:
stats = net_stats()
stats.rx_packets = stats_ptr.rx_packets + 1
stats.tx_packets = stats_ptr.tx_packets
stats.rx_bytes = stats_ptr.rx_bytes + pkt_len
stats.tx_bytes = stats_ptr.tx_bytes
net_stats_map.update(cgroup_id, stats)
else:
stats = net_stats()
stats.rx_packets = c_uint64(1)
stats.tx_packets = c_uint64(0)
stats.rx_bytes = pkt_len
stats.tx_bytes = c_uint64(0)
net_stats_map.update(cgroup_id, stats)
return c_int32(0)
@bpf
@section("kprobe/__dev_queue_xmit")
def trace_dev_xmit(ctx3: struct_pt_regs) -> c_int32:
cgroup_id = get_current_cgroup_id()
skb = struct_sk_buff(ctx3.di)
pkt_len = c_uint64(skb.len)
stats_ptr = net_stats_map.lookup(cgroup_id)
if stats_ptr:
stats = net_stats()
stats.rx_packets = stats_ptr.rx_packets
stats.tx_packets = stats_ptr.tx_packets + 1
stats.rx_bytes = stats_ptr.rx_bytes
stats.tx_bytes = stats_ptr.tx_bytes + pkt_len
net_stats_map.update(cgroup_id, stats)
else:
stats = net_stats()
stats.rx_packets = c_uint64(0)
stats.tx_packets = c_uint64(1)
stats.rx_bytes = c_uint64(0)
stats.tx_bytes = pkt_len
net_stats_map.update(cgroup_id, stats)
return c_int32(0)
# ==================== Syscall Tracing ====================
@bpf
@section("tracepoint/raw_syscalls/sys_enter")
def count_syscalls(ctx: c_void_p) -> c_int32:
cgroup_id = get_current_cgroup_id()
count_ptr = syscall_count.lookup(cgroup_id)
if count_ptr:
new_count = count_ptr + c_uint64(1)
syscall_count.update(cgroup_id, new_count)
else:
syscall_count.update(cgroup_id, c_uint64(1))
return c_int32(0)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
# ==================== Main ====================
if __name__ == "__main__":
print("🔥 Loading BPF programs...")
# Load and attach BPF program
b = BPF()
b.load()
b.attach_all()
# Get map references and enable struct deserialization
read_map_ref = b["read_map"]
write_map_ref = b["write_map"]
net_stats_map_ref = b["net_stats_map"]
syscall_count_ref = b["syscall_count"]
read_map_ref.set_value_struct("read_stats")
write_map_ref.set_value_struct("write_stats")
net_stats_map_ref.set_value_struct("net_stats")
print("✅ BPF programs loaded and attached")
# Setup data collector
collector = ContainerDataCollector(
read_map_ref, write_map_ref, net_stats_map_ref, syscall_count_ref
)
# Create and run TUI
tui = ContainerMonitorTUI(collector)
tui.run()

View File

@ -0,0 +1,208 @@
"""Data collection and management for container monitoring."""
import os
import time
from pathlib import Path
from typing import Dict, List, Set, Optional
from dataclasses import dataclass
from collections import deque, defaultdict
@dataclass
class CgroupInfo:
"""Information about a cgroup."""
id: int
name: str
path: str
@dataclass
class ContainerStats:
"""Statistics for a container/cgroup."""
cgroup_id: int
cgroup_name: str
# File I/O
read_ops: int = 0
read_bytes: int = 0
write_ops: int = 0
write_bytes: int = 0
# Network I/O
rx_packets: int = 0
rx_bytes: int = 0
tx_packets: int = 0
tx_bytes: int = 0
# Syscalls
syscall_count: int = 0
# Timestamp
timestamp: float = 0.0
class ContainerDataCollector:
"""Collects and manages container monitoring data from BPF."""
def __init__(
self, read_map, write_map, net_stats_map, syscall_map, history_size: int = 100
):
self.read_map = read_map
self.write_map = write_map
self.net_stats_map = net_stats_map
self.syscall_map = syscall_map
# Caching
self._cgroup_cache: Dict[int, CgroupInfo] = {}
self._cgroup_cache_time = 0
self._cache_ttl = 5.0
0 # Refresh cache every 5 seconds
# Historical data for graphing
self._history_size = history_size
self._history: Dict[int, deque] = defaultdict(
lambda: deque(maxlen=history_size)
)
def get_all_cgroups(self) -> List[CgroupInfo]:
"""Get all cgroups with caching."""
current_time = time.time()
# Use cached data if still valid
if current_time - self._cgroup_cache_time < self._cache_ttl:
return list(self._cgroup_cache.values())
# Refresh cache
self._refresh_cgroup_cache()
return list(self._cgroup_cache.values())
def _refresh_cgroup_cache(self):
"""Refresh the cgroup cache from /proc."""
cgroup_map: Dict[int, Set[str]] = defaultdict(set)
# Scan /proc to find all cgroups
for proc_dir in Path("/proc").glob("[0-9]*"):
try:
cgroup_file = proc_dir / "cgroup"
if not cgroup_file.exists():
continue
with open(cgroup_file) as f:
for line in f:
parts = line.strip().split(":")
if len(parts) >= 3:
cgroup_path = parts[2]
cgroup_mount = f"/sys/fs/cgroup{cgroup_path}"
if os.path.exists(cgroup_mount):
stat_info = os.stat(cgroup_mount)
cgroup_id = stat_info.st_ino
cgroup_map[cgroup_id].add(cgroup_path)
except (PermissionError, FileNotFoundError, OSError):
continue
# Update cache with best names
new_cache = {}
for cgroup_id, paths in cgroup_map.items():
# Pick the most descriptive path
best_path = self._get_best_cgroup_path(paths)
name = self._get_cgroup_name(best_path)
new_cache[cgroup_id] = CgroupInfo(id=cgroup_id, name=name, path=best_path)
self._cgroup_cache = new_cache
self._cgroup_cache_time = time.time()
def _get_best_cgroup_path(self, paths: Set[str]) -> str:
"""Select the most descriptive cgroup path."""
path_list = list(paths)
# Prefer paths with more components (more specific)
# Prefer paths containing docker, podman, etc.
for keyword in ["docker", "podman", "kubernetes", "k8s", "systemd"]:
for path in path_list:
if keyword in path.lower():
return path
# Return longest path (most specific)
return max(path_list, key=lambda p: (len(p.split("/")), len(p)))
def _get_cgroup_name(self, path: str) -> str:
"""Extract a friendly name from cgroup path."""
if not path or path == "/":
return "root"
# Remove leading/trailing slashes
path = path.strip("/")
# Try to extract container ID or service name
parts = path.split("/")
# For Docker: /docker/<container_id>
if "docker" in path.lower():
for i, part in enumerate(parts):
if part.lower() == "docker" and i + 1 < len(parts):
container_id = parts[i + 1][:12] # Short ID
return f"docker:{container_id}"
# For systemd services
if "system.slice" in path:
for part in parts:
if part.endswith(".service"):
return part.replace(".service", "")
# For user slices
if "user.slice" in path:
return f"user:{parts[-1]}" if parts else "user"
# Default: use last component
return parts[-1] if parts else path
def get_stats_for_cgroup(self, cgroup_id: int) -> ContainerStats:
"""Get current statistics for a specific cgroup."""
cgroup_info = self._cgroup_cache.get(cgroup_id)
cgroup_name = cgroup_info.name if cgroup_info else f"cgroup-{cgroup_id}"
stats = ContainerStats(
cgroup_id=cgroup_id, cgroup_name=cgroup_name, timestamp=time.time()
)
# Get file I/O stats
read_stat = self.read_map.lookup(cgroup_id)
if read_stat:
stats.read_ops = int(read_stat.ops)
stats.read_bytes = int(read_stat.bytes)
write_stat = self.write_map.lookup(cgroup_id)
if write_stat:
stats.write_ops = int(write_stat.ops)
stats.write_bytes = int(write_stat.bytes)
# Get network stats
net_stat = self.net_stats_map.lookup(cgroup_id)
if net_stat:
stats.rx_packets = int(net_stat.rx_packets)
stats.rx_bytes = int(net_stat.rx_bytes)
stats.tx_packets = int(net_stat.tx_packets)
stats.tx_bytes = int(net_stat.tx_bytes)
# Get syscall count
syscall_cnt = self.syscall_map.lookup(cgroup_id)
if syscall_cnt is not None:
stats.syscall_count = int(syscall_cnt)
# Add to history
self._history[cgroup_id].append(stats)
return stats
def get_history(self, cgroup_id: int) -> List[ContainerStats]:
"""Get historical statistics for graphing."""
return list(self._history[cgroup_id])
def get_cgroup_info(self, cgroup_id: int) -> Optional[CgroupInfo]:
"""Get cached cgroup information."""
return self._cgroup_cache.get(cgroup_id)

View File

@ -0,0 +1,752 @@
"""Terminal User Interface for container monitoring."""
import time
import curses
import threading
from typing import Optional, List
from data_collection import ContainerDataCollector
from web_dashboard import WebDashboard
def _safe_addstr(stdscr, y: int, x: int, text: str, *args):
"""Safely add string to screen with bounds checking."""
try:
height, width = stdscr.getmaxyx()
if 0 <= y < height and 0 <= x < width:
# Truncate text to fit
max_len = width - x - 1
if max_len > 0:
stdscr.addstr(y, x, text[:max_len], *args)
except curses.error:
pass
def _draw_fancy_header(stdscr, title: str, subtitle: str):
"""Draw a fancy header with title and subtitle."""
height, width = stdscr.getmaxyx()
# Top border
_safe_addstr(stdscr, 0, 0, "" * width, curses.color_pair(6) | curses.A_BOLD)
# Title
_safe_addstr(
stdscr,
0,
max(0, (width - len(title)) // 2),
f" {title} ",
curses.color_pair(6) | curses.A_BOLD,
)
# Subtitle
_safe_addstr(
stdscr,
1,
max(0, (width - len(subtitle)) // 2),
subtitle,
curses.color_pair(1),
)
# Bottom border
_safe_addstr(stdscr, 2, 0, "" * width, curses.color_pair(6))
def _draw_metric_box(
stdscr,
y: int,
x: int,
width: int,
label: str,
value: str,
detail: str,
color_pair: int,
):
"""Draw a fancy box for displaying a metric."""
height, _ = stdscr.getmaxyx()
if y + 4 >= height:
return
# Top border
_safe_addstr(
stdscr, y, x, "" + "" * (width - 2) + "", color_pair | curses.A_BOLD
)
# Label
_safe_addstr(stdscr, y + 1, x, "", color_pair | curses.A_BOLD)
_safe_addstr(stdscr, y + 1, x + 2, label, color_pair | curses.A_BOLD)
_safe_addstr(stdscr, y + 1, x + width - 1, "", color_pair | curses.A_BOLD)
# Value
_safe_addstr(stdscr, y + 2, x, "", color_pair | curses.A_BOLD)
_safe_addstr(stdscr, y + 2, x + 4, value, curses.color_pair(2) | curses.A_BOLD)
_safe_addstr(
stdscr,
y + 2,
min(x + width - len(detail) - 3, x + width - 2),
detail,
color_pair | curses.A_BOLD,
)
_safe_addstr(stdscr, y + 2, x + width - 1, "", color_pair | curses.A_BOLD)
# Bottom border
_safe_addstr(
stdscr, y + 3, x, "" + "" * (width - 2) + "", color_pair | curses.A_BOLD
)
def _draw_section_header(stdscr, y: int, title: str, color_pair: int):
"""Draw a section header."""
height, width = stdscr.getmaxyx()
if y >= height:
return
_safe_addstr(stdscr, y, 2, title, curses.color_pair(color_pair) | curses.A_BOLD)
_safe_addstr(
stdscr,
y,
len(title) + 3,
"" * (width - len(title) - 5),
curses.color_pair(color_pair) | curses.A_BOLD,
)
def _calculate_rates(history: List) -> dict:
"""Calculate per-second rates from history."""
if len(history) < 2:
return {
"syscalls_per_sec": 0.0,
"rx_bytes_per_sec": 0.0,
"tx_bytes_per_sec": 0.0,
"rx_pkts_per_sec": 0.0,
"tx_pkts_per_sec": 0.0,
"read_bytes_per_sec": 0.0,
"write_bytes_per_sec": 0.0,
"read_ops_per_sec": 0.0,
"write_ops_per_sec": 0.0,
}
# Calculate delta between last two samples
recent = history[-1]
previous = history[-2]
time_delta = recent.timestamp - previous.timestamp
if time_delta <= 0:
time_delta = 1.0
return {
"syscalls_per_sec": (recent.syscall_count - previous.syscall_count)
/ time_delta,
"rx_bytes_per_sec": (recent.rx_bytes - previous.rx_bytes) / time_delta,
"tx_bytes_per_sec": (recent.tx_bytes - previous.tx_bytes) / time_delta,
"rx_pkts_per_sec": (recent.rx_packets - previous.rx_packets) / time_delta,
"tx_pkts_per_sec": (recent.tx_packets - previous.tx_packets) / time_delta,
"read_bytes_per_sec": (recent.read_bytes - previous.read_bytes) / time_delta,
"write_bytes_per_sec": (recent.write_bytes - previous.write_bytes) / time_delta,
"read_ops_per_sec": (recent.read_ops - previous.read_ops) / time_delta,
"write_ops_per_sec": (recent.write_ops - previous.write_ops) / time_delta,
}
def _format_bytes(bytes_val: float) -> str:
"""Format bytes into human-readable string."""
if bytes_val < 0:
bytes_val = 0
for unit in ["B", "KB", "MB", "GB", "TB"]:
if bytes_val < 1024.0:
return f"{bytes_val:.1f}{unit}"
bytes_val /= 1024.0
return f"{bytes_val:.1f}PB"
def _draw_bar_graph_enhanced(
stdscr,
y: int,
x: int,
width: int,
height: int,
data: List[float],
color_pair: int,
):
"""Draw an enhanced bar graph with axis and scale."""
screen_height, screen_width = stdscr.getmaxyx()
if not data or width < 2 or y + height >= screen_height:
return
# Calculate statistics
max_val = max(data) if max(data) > 0 else 1
min_val = min(data)
avg_val = sum(data) / len(data)
# Take last 'width - 12' data points (leave room for Y-axis)
graph_width = max(1, width - 12)
recent_data = data[-graph_width:] if len(data) > graph_width else data
# Draw Y-axis labels (with bounds checking)
if y < screen_height:
_safe_addstr(
stdscr, y, x, f"{_format_bytes(max_val):>9}", curses.color_pair(7)
)
if y + height // 2 < screen_height:
_safe_addstr(
stdscr,
y + height // 2,
x,
f"{_format_bytes(avg_val):>9}",
curses.color_pair(7),
)
if y + height - 1 < screen_height:
_safe_addstr(
stdscr,
y + height - 1,
x,
f"{_format_bytes(min_val):>9}",
curses.color_pair(7),
)
# Draw bars
for row in range(height):
if y + row >= screen_height:
break
threshold = (height - row) / height
bar_line = ""
for val in recent_data:
normalized = val / max_val if max_val > 0 else 0
if normalized >= threshold:
bar_line += ""
elif normalized >= threshold - 0.15:
bar_line += ""
elif normalized >= threshold - 0.35:
bar_line += ""
elif normalized >= threshold - 0.5:
bar_line += ""
else:
bar_line += " "
_safe_addstr(stdscr, y + row, x + 11, bar_line, color_pair)
# Draw X-axis
if y + height < screen_height:
_safe_addstr(
stdscr,
y + height,
x + 10,
"" + "" * len(recent_data),
curses.color_pair(7),
)
_safe_addstr(
stdscr,
y + height,
x + 10 + len(recent_data),
"→ time",
curses.color_pair(7),
)
def _draw_labeled_graph(
stdscr,
y: int,
x: int,
width: int,
height: int,
label: str,
rate: str,
detail: str,
data: List[float],
color_pair: int,
description: str,
):
"""Draw a graph with labels and legend."""
screen_height, screen_width = stdscr.getmaxyx()
if y >= screen_height or y + height + 2 >= screen_height:
return
# Header with metrics
_safe_addstr(stdscr, y, x, label, curses.color_pair(1) | curses.A_BOLD)
_safe_addstr(stdscr, y, x + len(label) + 2, rate, curses.color_pair(2))
_safe_addstr(
stdscr, y, x + len(label) + len(rate) + 4, detail, curses.color_pair(7)
)
# Draw the graph
if len(data) > 1:
_draw_bar_graph_enhanced(stdscr, y + 1, x, width, height, data, color_pair)
else:
_safe_addstr(stdscr, y + 2, x + 2, "Collecting data...", curses.color_pair(7))
# Graph legend
if y + height + 1 < screen_height:
_safe_addstr(
stdscr, y + height + 1, x, f"└─ {description}", curses.color_pair(7)
)
class ContainerMonitorTUI:
"""TUI for container monitoring with cgroup selection and live graphs."""
def __init__(self, collector: ContainerDataCollector):
self.collector = collector
self.selected_cgroup: Optional[int] = None
self.current_screen = "selection" # "selection" or "monitoring"
self.selected_index = 0
self.scroll_offset = 0
self.web_dashboard = None
self.web_thread = None
def run(self):
"""Run the TUI application."""
curses.wrapper(self._main_loop)
def _main_loop(self, stdscr):
"""Main curses loop."""
# Configure curses
curses.curs_set(0) # Hide cursor
stdscr.nodelay(True) # Non-blocking input
stdscr.timeout(100) # Refresh every 100ms
# Initialize colors
curses.start_color()
curses.init_pair(1, curses.COLOR_CYAN, curses.COLOR_BLACK)
curses.init_pair(2, curses.COLOR_GREEN, curses.COLOR_BLACK)
curses.init_pair(3, curses.COLOR_YELLOW, curses.COLOR_BLACK)
curses.init_pair(4, curses.COLOR_RED, curses.COLOR_BLACK)
curses.init_pair(5, curses.COLOR_MAGENTA, curses.COLOR_BLACK)
curses.init_pair(6, curses.COLOR_WHITE, curses.COLOR_BLUE)
curses.init_pair(7, curses.COLOR_BLUE, curses.COLOR_BLACK)
curses.init_pair(8, curses.COLOR_WHITE, curses.COLOR_CYAN)
while True:
stdscr.clear()
try:
height, width = stdscr.getmaxyx()
# Check minimum terminal size
if height < 25 or width < 80:
msg = "Terminal too small! Minimum: 80x25"
stdscr.attron(curses.color_pair(4) | curses.A_BOLD)
stdscr.addstr(
height // 2, max(0, (width - len(msg)) // 2), msg[: width - 1]
)
stdscr.attroff(curses.color_pair(4) | curses.A_BOLD)
stdscr.refresh()
key = stdscr.getch()
if key == ord("q") or key == ord("Q"):
break
continue
if self.current_screen == "selection":
self._draw_selection_screen(stdscr)
elif self.current_screen == "monitoring":
self._draw_monitoring_screen(stdscr)
stdscr.refresh()
# Handle input
key = stdscr.getch()
if key != -1:
if not self._handle_input(key, stdscr):
break # Exit requested
except KeyboardInterrupt:
break
except curses.error:
# Curses error - likely terminal too small, just continue
pass
except Exception as e:
# Show error briefly
height, width = stdscr.getmaxyx()
error_msg = f"Error: {str(e)[: width - 10]}"
stdscr.addstr(0, 0, error_msg[: width - 1])
stdscr.refresh()
time.sleep(1)
def _draw_selection_screen(self, stdscr):
"""Draw the cgroup selection screen."""
height, width = stdscr.getmaxyx()
# Draw fancy header box
_draw_fancy_header(stdscr, "🐳 CONTAINER MONITOR", "Select a Cgroup to Monitor")
# Instructions
instructions = (
"↑↓: Navigate | ENTER: Select | w: Web Mode | q: Quit | r: Refresh"
)
_safe_addstr(
stdscr,
3,
max(0, (width - len(instructions)) // 2),
instructions,
curses.color_pair(3),
)
# Get cgroups
cgroups = self.collector.get_all_cgroups()
if not cgroups:
msg = "No cgroups found. Waiting for activity..."
_safe_addstr(
stdscr,
height // 2,
max(0, (width - len(msg)) // 2),
msg,
curses.color_pair(4),
)
return
# Sort cgroups by name
cgroups.sort(key=lambda c: c.name)
# Adjust selection bounds
if self.selected_index >= len(cgroups):
self.selected_index = len(cgroups) - 1
if self.selected_index < 0:
self.selected_index = 0
# Calculate visible range
list_height = max(1, height - 8)
if self.selected_index < self.scroll_offset:
self.scroll_offset = self.selected_index
elif self.selected_index >= self.scroll_offset + list_height:
self.scroll_offset = self.selected_index - list_height + 1
# Calculate max name length and ID width for alignment
max_name_len = min(50, max(len(cg.name) for cg in cgroups))
max_id_len = max(len(str(cg.id)) for cg in cgroups)
# Draw cgroup list with fancy borders
start_y = 5
_safe_addstr(
stdscr, start_y, 2, "" + "" * (width - 6) + "", curses.color_pair(1)
)
# Header row
header = f" {'CGROUP NAME':<{max_name_len}}{'ID':>{max_id_len}} "
_safe_addstr(stdscr, start_y + 1, 2, "", curses.color_pair(1))
_safe_addstr(
stdscr, start_y + 1, 3, header, curses.color_pair(1) | curses.A_BOLD
)
_safe_addstr(stdscr, start_y + 1, width - 3, "", curses.color_pair(1))
# Separator
_safe_addstr(
stdscr, start_y + 2, 2, "" + "" * (width - 6) + "", curses.color_pair(1)
)
for i in range(list_height):
idx = self.scroll_offset + i
y = start_y + 3 + i
if y >= height - 2:
break
_safe_addstr(stdscr, y, 2, "", curses.color_pair(1))
_safe_addstr(stdscr, y, width - 3, "", curses.color_pair(1))
if idx >= len(cgroups):
continue
cgroup = cgroups[idx]
# Truncate name if too long
display_name = (
cgroup.name
if len(cgroup.name) <= max_name_len
else cgroup.name[: max_name_len - 3] + "..."
)
if idx == self.selected_index:
# Highlight selected with proper alignment
line = f"{display_name:<{max_name_len}}{cgroup.id:>{max_id_len}} "
_safe_addstr(stdscr, y, 3, line, curses.color_pair(8) | curses.A_BOLD)
else:
line = f" {display_name:<{max_name_len}}{cgroup.id:>{max_id_len}} "
_safe_addstr(stdscr, y, 3, line, curses.color_pair(7))
# Bottom border
bottom_y = min(start_y + 3 + list_height, height - 3)
_safe_addstr(
stdscr, bottom_y, 2, "" + "" * (width - 6) + "", curses.color_pair(1)
)
# Footer
footer = f"Total: {len(cgroups)} cgroups"
if len(cgroups) > list_height:
footer += f" │ Showing {self.scroll_offset + 1}-{min(self.scroll_offset + list_height, len(cgroups))}"
_safe_addstr(
stdscr,
height - 2,
max(0, (width - len(footer)) // 2),
footer,
curses.color_pair(1),
)
def _draw_monitoring_screen(self, stdscr):
"""Draw the monitoring screen for selected cgroup."""
height, width = stdscr.getmaxyx()
if self.selected_cgroup is None:
return
# Get current stats
stats = self.collector.get_stats_for_cgroup(self.selected_cgroup)
history = self.collector.get_history(self.selected_cgroup)
# Draw fancy header
_draw_fancy_header(
stdscr, f"📊 {stats.cgroup_name[:40]}", "Live Performance Metrics"
)
# Instructions
instructions = "ESC/b: Back to List | w: Web Mode | q: Quit"
_safe_addstr(
stdscr,
3,
max(0, (width - len(instructions)) // 2),
instructions,
curses.color_pair(3),
)
# Calculate metrics for rate display
rates = _calculate_rates(history)
y = 5
# Syscall count in a fancy box
if y + 4 < height:
_draw_metric_box(
stdscr,
y,
2,
min(width - 4, 80),
"⚡ SYSTEM CALLS",
f"{stats.syscall_count:,}",
f"Rate: {rates['syscalls_per_sec']:.1f}/sec",
curses.color_pair(5),
)
y += 4
# Network I/O Section
if y + 8 < height:
_draw_section_header(stdscr, y, "🌐 NETWORK I/O", 1)
y += 1
# RX graph
rx_label = f"RX: {_format_bytes(stats.rx_bytes)}"
rx_rate = f"{_format_bytes(rates['rx_bytes_per_sec'])}/s"
rx_pkts = f"{stats.rx_packets:,} pkts ({rates['rx_pkts_per_sec']:.1f}/s)"
_draw_labeled_graph(
stdscr,
y,
2,
width - 4,
4,
rx_label,
rx_rate,
rx_pkts,
[s.rx_bytes for s in history],
curses.color_pair(2),
"Received Traffic (last 100 samples)",
)
y += 6
# TX graph
if y + 8 < height:
tx_label = f"TX: {_format_bytes(stats.tx_bytes)}"
tx_rate = f"{_format_bytes(rates['tx_bytes_per_sec'])}/s"
tx_pkts = f"{stats.tx_packets:,} pkts ({rates['tx_pkts_per_sec']:.1f}/s)"
_draw_labeled_graph(
stdscr,
y,
2,
width - 4,
4,
tx_label,
tx_rate,
tx_pkts,
[s.tx_bytes for s in history],
curses.color_pair(3),
"Transmitted Traffic (last 100 samples)",
)
y += 6
# File I/O Section
if y + 8 < height:
_draw_section_header(stdscr, y, "💾 FILE I/O", 1)
y += 1
# Read graph
read_label = f"READ: {_format_bytes(stats.read_bytes)}"
read_rate = f"{_format_bytes(rates['read_bytes_per_sec'])}/s"
read_ops = f"{stats.read_ops:,} ops ({rates['read_ops_per_sec']:.1f}/s)"
_draw_labeled_graph(
stdscr,
y,
2,
width - 4,
4,
read_label,
read_rate,
read_ops,
[s.read_bytes for s in history],
curses.color_pair(4),
"Read Operations (last 100 samples)",
)
y += 6
# Write graph
if y + 8 < height:
write_label = f"WRITE: {_format_bytes(stats.write_bytes)}"
write_rate = f"{_format_bytes(rates['write_bytes_per_sec'])}/s"
write_ops = f"{stats.write_ops:,} ops ({rates['write_ops_per_sec']:.1f}/s)"
_draw_labeled_graph(
stdscr,
y,
2,
width - 4,
4,
write_label,
write_rate,
write_ops,
[s.write_bytes for s in history],
curses.color_pair(5),
"Write Operations (last 100 samples)",
)
def _launch_web_mode(self, stdscr):
"""Launch web dashboard mode."""
height, width = stdscr.getmaxyx()
# Show transition message
stdscr.clear()
msg1 = "🌐 LAUNCHING WEB DASHBOARD"
_safe_addstr(
stdscr,
height // 2 - 2,
max(0, (width - len(msg1)) // 2),
msg1,
curses.color_pair(6) | curses.A_BOLD,
)
msg2 = "Server starting at http://localhost:8050"
_safe_addstr(
stdscr,
height // 2,
max(0, (width - len(msg2)) // 2),
msg2,
curses.color_pair(2),
)
msg3 = "Press 'q' to stop web server and return to TUI"
_safe_addstr(
stdscr,
height // 2 + 2,
max(0, (width - len(msg3)) // 2),
msg3,
curses.color_pair(3),
)
stdscr.refresh()
time.sleep(1)
try:
# Create and start web dashboard
self.web_dashboard = WebDashboard(
self.collector, selected_cgroup=self.selected_cgroup
)
# Start in background thread
self.web_thread = threading.Thread(
target=self.web_dashboard.run, daemon=True
)
self.web_thread.start()
time.sleep(2) # Give server time to start
# Wait for user to press 'q' to return
msg4 = "Web dashboard running at http://localhost:8050"
msg5 = "Press 'q' to return to TUI"
_safe_addstr(
stdscr,
height // 2 + 4,
max(0, (width - len(msg4)) // 2),
msg4,
curses.color_pair(1) | curses.A_BOLD,
)
_safe_addstr(
stdscr,
height // 2 + 5,
max(0, (width - len(msg5)) // 2),
msg5,
curses.color_pair(3) | curses.A_BOLD,
)
stdscr.refresh()
stdscr.nodelay(False) # Blocking mode
while True:
key = stdscr.getch()
if key == ord("q") or key == ord("Q"):
break
# Stop web server
if self.web_dashboard:
self.web_dashboard.stop()
except Exception as e:
error_msg = f"Error starting web dashboard: {str(e)}"
_safe_addstr(
stdscr,
height // 2 + 4,
max(0, (width - len(error_msg)) // 2),
error_msg,
curses.color_pair(4),
)
stdscr.refresh()
time.sleep(3)
# Restore TUI settings
stdscr.nodelay(True)
stdscr.timeout(100)
def _handle_input(self, key: int, stdscr) -> bool:
"""Handle keyboard input. Returns False to exit."""
if key == ord("q") or key == ord("Q"):
return False # Exit
if key == ord("w") or key == ord("W"):
# Launch web mode
self._launch_web_mode(stdscr)
return True
if self.current_screen == "selection":
if key == curses.KEY_UP:
self.selected_index = max(0, self.selected_index - 1)
elif key == curses.KEY_DOWN:
cgroups = self.collector.get_all_cgroups()
self.selected_index = min(len(cgroups) - 1, self.selected_index + 1)
elif key == ord("\n") or key == curses.KEY_ENTER or key == 10:
# Select cgroup
cgroups = self.collector.get_all_cgroups()
if cgroups and 0 <= self.selected_index < len(cgroups):
cgroups.sort(key=lambda c: c.name)
self.selected_cgroup = cgroups[self.selected_index].id
self.current_screen = "monitoring"
elif key == ord("r") or key == ord("R"):
# Force refresh cache
self.collector._cgroup_cache_time = 0
elif self.current_screen == "monitoring":
if key == 27 or key == ord("b") or key == ord("B"): # ESC or 'b'
self.current_screen = "selection"
self.selected_cgroup = None
return True # Continue running

View File

@ -0,0 +1,826 @@
"""Beautiful web dashboard for container monitoring using Plotly Dash."""
import dash
from dash import dcc, html
from dash.dependencies import Input, Output
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from typing import Optional
from data_collection import ContainerDataCollector
class WebDashboard:
"""Beautiful web dashboard for container monitoring."""
def __init__(
self,
collector: ContainerDataCollector,
selected_cgroup: Optional[int] = None,
host: str = "0.0.0.0",
port: int = 8050,
):
self.collector = collector
self.selected_cgroup = selected_cgroup
self.host = host
self.port = port
# Suppress Dash dev tools and debug output
self.app = dash.Dash(
__name__,
title="pythonBPF Container Monitor",
suppress_callback_exceptions=True,
)
self._setup_layout()
self._setup_callbacks()
self._running = False
def _setup_layout(self):
"""Create the dashboard layout."""
self.app.layout = html.Div(
[
# Futuristic Header with pythonBPF branding
html.Div(
[
html.Div(
[
html.Div(
[
html.Span(
"python",
style={
"fontSize": "52px",
"fontWeight": "300",
"color": "#00ff88",
"fontFamily": "'Courier New', monospace",
"textShadow": "0 0 20px rgba(0,255,136,0.5)",
},
),
html.Span(
"BPF",
style={
"fontSize": "52px",
"fontWeight": "900",
"color": "#00d4ff",
"fontFamily": "'Courier New', monospace",
"textShadow": "0 0 20px rgba(0,212,255,0.5)",
},
),
],
style={"marginBottom": "5px"},
),
html.Div(
"CONTAINER PERFORMANCE MONITOR",
style={
"fontSize": "16px",
"letterSpacing": "8px",
"color": "#8899ff",
"fontWeight": "300",
"fontFamily": "'Courier New', monospace",
},
),
],
style={
"textAlign": "center",
},
),
html.Div(
id="cgroup-name",
style={
"textAlign": "center",
"color": "#00ff88",
"fontSize": "20px",
"marginTop": "15px",
"fontFamily": "'Courier New', monospace",
"fontWeight": "bold",
"textShadow": "0 0 10px rgba(0,255,136,0.3)",
},
),
],
style={
"background": "linear-gradient(135deg, #0a0e27 0%, #1a1f3a 50%, #0a0e27 100%)",
"padding": "40px 20px",
"borderRadius": "0",
"marginBottom": "0",
"boxShadow": "0 10px 40px rgba(0,212,255,0.2)",
"border": "1px solid rgba(0,212,255,0.3)",
"borderTop": "3px solid #00d4ff",
"borderBottom": "3px solid #00ff88",
"position": "relative",
"overflow": "hidden",
},
),
# Cgroup selector (if no cgroup selected)
html.Div(
[
html.Label(
"SELECT CGROUP:",
style={
"fontSize": "14px",
"fontWeight": "bold",
"color": "#00d4ff",
"marginRight": "15px",
"fontFamily": "'Courier New', monospace",
"letterSpacing": "2px",
},
),
dcc.Dropdown(
id="cgroup-selector",
style={
"width": "600px",
"display": "inline-block",
"background": "#1a1f3a",
"border": "1px solid #00d4ff",
},
),
],
id="selector-container",
style={
"textAlign": "center",
"marginTop": "30px",
"marginBottom": "30px",
"padding": "20px",
"background": "rgba(26,31,58,0.5)",
"borderRadius": "10px",
"border": "1px solid rgba(0,212,255,0.2)",
"display": "block" if self.selected_cgroup is None else "none",
},
),
# Stats cards row
html.Div(
[
self._create_stat_card(
"syscall-card", "⚡ SYSCALLS", "#00ff88"
),
self._create_stat_card("network-card", "🌐 NETWORK", "#00d4ff"),
self._create_stat_card("file-card", "💾 FILE I/O", "#ff0088"),
],
style={
"display": "flex",
"justifyContent": "space-around",
"marginBottom": "30px",
"marginTop": "30px",
"gap": "25px",
"flexWrap": "wrap",
"padding": "0 20px",
},
),
# Graphs container
html.Div(
[
# Network graphs
html.Div(
[
html.Div(
[
html.Span("🌐 ", style={"fontSize": "24px"}),
html.Span(
"NETWORK",
style={
"fontFamily": "'Courier New', monospace",
"letterSpacing": "3px",
"fontWeight": "bold",
},
),
html.Span(
" I/O",
style={
"fontFamily": "'Courier New', monospace",
"letterSpacing": "3px",
"color": "#00d4ff",
},
),
],
style={
"color": "#ffffff",
"fontSize": "20px",
"borderBottom": "2px solid #00d4ff",
"paddingBottom": "15px",
"marginBottom": "25px",
"textShadow": "0 0 10px rgba(0,212,255,0.3)",
},
),
dcc.Graph(
id="network-graph", style={"height": "400px"}
),
],
style={
"background": "linear-gradient(135deg, #0a0e27 0%, #1a1f3a 100%)",
"padding": "30px",
"borderRadius": "15px",
"boxShadow": "0 8px 32px rgba(0,212,255,0.15)",
"marginBottom": "30px",
"border": "1px solid rgba(0,212,255,0.2)",
},
),
# File I/O graphs
html.Div(
[
html.Div(
[
html.Span("💾 ", style={"fontSize": "24px"}),
html.Span(
"FILE",
style={
"fontFamily": "'Courier New', monospace",
"letterSpacing": "3px",
"fontWeight": "bold",
},
),
html.Span(
" I/O",
style={
"fontFamily": "'Courier New', monospace",
"letterSpacing": "3px",
"color": "#ff0088",
},
),
],
style={
"color": "#ffffff",
"fontSize": "20px",
"borderBottom": "2px solid #ff0088",
"paddingBottom": "15px",
"marginBottom": "25px",
"textShadow": "0 0 10px rgba(255,0,136,0.3)",
},
),
dcc.Graph(
id="file-io-graph", style={"height": "400px"}
),
],
style={
"background": "linear-gradient(135deg, #0a0e27 0%, #1a1f3a 100%)",
"padding": "30px",
"borderRadius": "15px",
"boxShadow": "0 8px 32px rgba(255,0,136,0.15)",
"marginBottom": "30px",
"border": "1px solid rgba(255,0,136,0.2)",
},
),
# Combined time series
html.Div(
[
html.Div(
[
html.Span("📈 ", style={"fontSize": "24px"}),
html.Span(
"REAL-TIME",
style={
"fontFamily": "'Courier New', monospace",
"letterSpacing": "3px",
"fontWeight": "bold",
},
),
html.Span(
" METRICS",
style={
"fontFamily": "'Courier New', monospace",
"letterSpacing": "3px",
"color": "#00ff88",
},
),
],
style={
"color": "#ffffff",
"fontSize": "20px",
"borderBottom": "2px solid #00ff88",
"paddingBottom": "15px",
"marginBottom": "25px",
"textShadow": "0 0 10px rgba(0,255,136,0.3)",
},
),
dcc.Graph(
id="timeseries-graph", style={"height": "500px"}
),
],
style={
"background": "linear-gradient(135deg, #0a0e27 0%, #1a1f3a 100%)",
"padding": "30px",
"borderRadius": "15px",
"boxShadow": "0 8px 32px rgba(0,255,136,0.15)",
"border": "1px solid rgba(0,255,136,0.2)",
},
),
],
style={"padding": "0 20px"},
),
# Footer with pythonBPF branding
html.Div(
[
html.Div(
[
html.Span(
"Powered by ",
style={"color": "#8899ff", "fontSize": "12px"},
),
html.Span(
"pythonBPF",
style={
"color": "#00d4ff",
"fontSize": "14px",
"fontWeight": "bold",
"fontFamily": "'Courier New', monospace",
},
),
html.Span(
" | eBPF Container Monitoring",
style={
"color": "#8899ff",
"fontSize": "12px",
"marginLeft": "10px",
},
),
]
)
],
style={
"textAlign": "center",
"padding": "20px",
"marginTop": "40px",
"background": "linear-gradient(135deg, #0a0e27 0%, #1a1f3a 100%)",
"borderTop": "1px solid rgba(0,212,255,0.2)",
},
),
# Auto-update interval
dcc.Interval(id="interval-component", interval=1000, n_intervals=0),
],
style={
"padding": "0",
"fontFamily": "'Segoe UI', 'Courier New', monospace",
"background": "linear-gradient(to bottom, #050813 0%, #0a0e27 100%)",
"minHeight": "100vh",
"margin": "0",
},
)
def _create_stat_card(self, card_id: str, title: str, color: str):
"""Create a statistics card with futuristic styling."""
return html.Div(
[
html.H3(
title,
style={
"color": color,
"fontSize": "16px",
"marginBottom": "20px",
"fontWeight": "bold",
"fontFamily": "'Courier New', monospace",
"letterSpacing": "2px",
"textShadow": f"0 0 10px {color}50",
},
),
html.Div(
[
html.Div(
id=f"{card_id}-value",
style={
"fontSize": "42px",
"fontWeight": "bold",
"color": "#ffffff",
"marginBottom": "10px",
"fontFamily": "'Courier New', monospace",
"textShadow": f"0 0 20px {color}40",
},
),
html.Div(
id=f"{card_id}-rate",
style={
"fontSize": "14px",
"color": "#8899ff",
"fontFamily": "'Courier New', monospace",
},
),
]
),
],
style={
"flex": "1",
"minWidth": "280px",
"background": "linear-gradient(135deg, #0a0e27 0%, #1a1f3a 100%)",
"padding": "30px",
"borderRadius": "15px",
"boxShadow": f"0 8px 32px {color}20",
"border": f"1px solid {color}40",
"borderLeft": f"4px solid {color}",
"transition": "transform 0.3s, box-shadow 0.3s",
"position": "relative",
"overflow": "hidden",
},
)
def _setup_callbacks(self):
"""Setup dashboard callbacks."""
@self.app.callback(
[Output("cgroup-selector", "options"), Output("cgroup-selector", "value")],
[Input("interval-component", "n_intervals")],
)
def update_cgroup_selector(n):
if self.selected_cgroup is not None:
return [], self.selected_cgroup
cgroups = self.collector.get_all_cgroups()
options = [
{"label": f"{cg.name} (ID: {cg.id})", "value": cg.id}
for cg in sorted(cgroups, key=lambda c: c.name)
]
value = options[0]["value"] if options else None
if value and self.selected_cgroup is None:
self.selected_cgroup = value
return options, self.selected_cgroup
@self.app.callback(
Output("cgroup-selector", "value", allow_duplicate=True),
[Input("cgroup-selector", "value")],
prevent_initial_call=True,
)
def select_cgroup(value):
if value:
self.selected_cgroup = value
return value
@self.app.callback(
[
Output("cgroup-name", "children"),
Output("syscall-card-value", "children"),
Output("syscall-card-rate", "children"),
Output("network-card-value", "children"),
Output("network-card-rate", "children"),
Output("file-card-value", "children"),
Output("file-card-rate", "children"),
Output("network-graph", "figure"),
Output("file-io-graph", "figure"),
Output("timeseries-graph", "figure"),
],
[Input("interval-component", "n_intervals")],
)
def update_dashboard(n):
if self.selected_cgroup is None:
empty_fig = self._create_empty_figure(
"Select a cgroup to begin monitoring"
)
return (
"SELECT A CGROUP TO START",
"0",
"",
"0 B",
"",
"0 B",
"",
empty_fig,
empty_fig,
empty_fig,
)
try:
stats = self.collector.get_stats_for_cgroup(self.selected_cgroup)
history = self.collector.get_history(self.selected_cgroup)
rates = self._calculate_rates(history)
return (
f"{stats.cgroup_name}",
f"{stats.syscall_count:,}",
f"{rates['syscalls_per_sec']:.1f} calls/sec",
f"{self._format_bytes(stats.rx_bytes + stats.tx_bytes)}",
f"{self._format_bytes(rates['rx_bytes_per_sec'])}/s ↑ {self._format_bytes(rates['tx_bytes_per_sec'])}/s",
f"{self._format_bytes(stats.read_bytes + stats.write_bytes)}",
f"R: {self._format_bytes(rates['read_bytes_per_sec'])}/s W: {self._format_bytes(rates['write_bytes_per_sec'])}/s",
self._create_network_graph(history),
self._create_file_io_graph(history),
self._create_timeseries_graph(history),
)
except Exception as e:
empty_fig = self._create_empty_figure(f"Error: {str(e)}")
return (
"ERROR",
"0",
str(e),
"0 B",
"",
"0 B",
"",
empty_fig,
empty_fig,
empty_fig,
)
def _create_empty_figure(self, message: str):
"""Create an empty figure with a message."""
fig = go.Figure()
fig.update_layout(
title=message,
template="plotly_dark",
paper_bgcolor="#0a0e27",
plot_bgcolor="#0a0e27",
font=dict(color="#8899ff", family="Courier New, monospace"),
)
return fig
def _create_network_graph(self, history):
"""Create network I/O graph with futuristic styling."""
if len(history) < 2:
return self._create_empty_figure("Collecting data...")
times = [i for i in range(len(history))]
rx_bytes = [s.rx_bytes for s in history]
tx_bytes = [s.tx_bytes for s in history]
fig = make_subplots(
rows=2,
cols=1,
subplot_titles=("RECEIVED (RX)", "TRANSMITTED (TX)"),
vertical_spacing=0.15,
)
fig.add_trace(
go.Scatter(
x=times,
y=rx_bytes,
mode="lines",
name="RX",
fill="tozeroy",
line=dict(color="#00d4ff", width=3, shape="spline"),
fillcolor="rgba(0, 212, 255, 0.2)",
),
row=1,
col=1,
)
fig.add_trace(
go.Scatter(
x=times,
y=tx_bytes,
mode="lines",
name="TX",
fill="tozeroy",
line=dict(color="#00ff88", width=3, shape="spline"),
fillcolor="rgba(0, 255, 136, 0.2)",
),
row=2,
col=1,
)
fig.update_xaxes(title_text="Time (samples)", row=2, col=1, color="#8899ff")
fig.update_yaxes(title_text="Bytes", row=1, col=1, color="#8899ff")
fig.update_yaxes(title_text="Bytes", row=2, col=1, color="#8899ff")
fig.update_layout(
height=400,
template="plotly_dark",
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="#0a0e27",
showlegend=False,
hovermode="x unified",
font=dict(family="Courier New, monospace", color="#8899ff"),
)
return fig
def _create_file_io_graph(self, history):
"""Create file I/O graph with futuristic styling."""
if len(history) < 2:
return self._create_empty_figure("Collecting data...")
times = [i for i in range(len(history))]
read_bytes = [s.read_bytes for s in history]
write_bytes = [s.write_bytes for s in history]
fig = make_subplots(
rows=2,
cols=1,
subplot_titles=("READ OPERATIONS", "WRITE OPERATIONS"),
vertical_spacing=0.15,
)
fig.add_trace(
go.Scatter(
x=times,
y=read_bytes,
mode="lines",
name="Read",
fill="tozeroy",
line=dict(color="#ff0088", width=3, shape="spline"),
fillcolor="rgba(255, 0, 136, 0.2)",
),
row=1,
col=1,
)
fig.add_trace(
go.Scatter(
x=times,
y=write_bytes,
mode="lines",
name="Write",
fill="tozeroy",
line=dict(color="#8844ff", width=3, shape="spline"),
fillcolor="rgba(136, 68, 255, 0.2)",
),
row=2,
col=1,
)
fig.update_xaxes(title_text="Time (samples)", row=2, col=1, color="#8899ff")
fig.update_yaxes(title_text="Bytes", row=1, col=1, color="#8899ff")
fig.update_yaxes(title_text="Bytes", row=2, col=1, color="#8899ff")
fig.update_layout(
height=400,
template="plotly_dark",
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="#0a0e27",
showlegend=False,
hovermode="x unified",
font=dict(family="Courier New, monospace", color="#8899ff"),
)
return fig
def _create_timeseries_graph(self, history):
"""Create combined time series graph with futuristic styling."""
if len(history) < 2:
return self._create_empty_figure("Collecting data...")
times = [i for i in range(len(history))]
fig = make_subplots(
rows=3,
cols=1,
subplot_titles=(
"SYSTEM CALLS",
"NETWORK TRAFFIC (Bytes)",
"FILE I/O (Bytes)",
),
vertical_spacing=0.1,
specs=[
[{"secondary_y": False}],
[{"secondary_y": True}],
[{"secondary_y": True}],
],
)
# Syscalls
fig.add_trace(
go.Scatter(
x=times,
y=[s.syscall_count for s in history],
mode="lines",
name="Syscalls",
line=dict(color="#00ff88", width=3, shape="spline"),
),
row=1,
col=1,
)
# Network
fig.add_trace(
go.Scatter(
x=times,
y=[s.rx_bytes for s in history],
mode="lines",
name="RX",
line=dict(color="#00d4ff", width=2, shape="spline"),
),
row=2,
col=1,
secondary_y=False,
)
fig.add_trace(
go.Scatter(
x=times,
y=[s.tx_bytes for s in history],
mode="lines",
name="TX",
line=dict(color="#00ff88", width=2, shape="spline", dash="dot"),
),
row=2,
col=1,
secondary_y=True,
)
# File I/O
fig.add_trace(
go.Scatter(
x=times,
y=[s.read_bytes for s in history],
mode="lines",
name="Read",
line=dict(color="#ff0088", width=2, shape="spline"),
),
row=3,
col=1,
secondary_y=False,
)
fig.add_trace(
go.Scatter(
x=times,
y=[s.write_bytes for s in history],
mode="lines",
name="Write",
line=dict(color="#8844ff", width=2, shape="spline", dash="dot"),
),
row=3,
col=1,
secondary_y=True,
)
fig.update_xaxes(title_text="Time (samples)", row=3, col=1, color="#8899ff")
fig.update_yaxes(title_text="Count", row=1, col=1, color="#8899ff")
fig.update_yaxes(
title_text="RX Bytes", row=2, col=1, secondary_y=False, color="#00d4ff"
)
fig.update_yaxes(
title_text="TX Bytes", row=2, col=1, secondary_y=True, color="#00ff88"
)
fig.update_yaxes(
title_text="Read Bytes", row=3, col=1, secondary_y=False, color="#ff0088"
)
fig.update_yaxes(
title_text="Write Bytes", row=3, col=1, secondary_y=True, color="#8844ff"
)
fig.update_layout(
height=500,
template="plotly_dark",
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="#0a0e27",
hovermode="x unified",
showlegend=True,
legend=dict(
orientation="h",
yanchor="bottom",
y=1.02,
xanchor="right",
x=1,
font=dict(color="#8899ff"),
),
font=dict(family="Courier New, monospace", color="#8899ff"),
)
return fig
def _calculate_rates(self, history):
"""Calculate rates from history."""
if len(history) < 2:
return {
"syscalls_per_sec": 0.0,
"rx_bytes_per_sec": 0.0,
"tx_bytes_per_sec": 0.0,
"read_bytes_per_sec": 0.0,
"write_bytes_per_sec": 0.0,
}
recent = history[-1]
previous = history[-2]
time_delta = recent.timestamp - previous.timestamp
if time_delta <= 0:
time_delta = 1.0
return {
"syscalls_per_sec": max(
0, (recent.syscall_count - previous.syscall_count) / time_delta
),
"rx_bytes_per_sec": max(
0, (recent.rx_bytes - previous.rx_bytes) / time_delta
),
"tx_bytes_per_sec": max(
0, (recent.tx_bytes - previous.tx_bytes) / time_delta
),
"read_bytes_per_sec": max(
0, (recent.read_bytes - previous.read_bytes) / time_delta
),
"write_bytes_per_sec": max(
0, (recent.write_bytes - previous.write_bytes) / time_delta
),
}
def _format_bytes(self, bytes_val: float) -> str:
"""Format bytes into human-readable string."""
if bytes_val < 0:
bytes_val = 0
for unit in ["B", "KB", "MB", "GB", "TB"]:
if bytes_val < 1024.0:
return f"{bytes_val:.2f} {unit}"
bytes_val /= 1024.0
return f"{bytes_val:.2f} PB"
def run(self):
"""Run the web dashboard."""
self._running = True
# Suppress Werkzeug logging
import logging
log = logging.getLogger("werkzeug")
log.setLevel(logging.ERROR)
self.app.run(debug=False, host=self.host, port=self.port, use_reloader=False)
def stop(self):
"""Stop the web dashboard."""
self._running = False

View File

@ -1,4 +1,4 @@
from pythonbpf import bpf, section, bpfglobal, BPF
from pythonbpf import bpf, section, bpfglobal, BPF, trace_pipe
from ctypes import c_void_p, c_int64
# Instructions to how to run this program
@ -21,10 +21,6 @@ def LICENSE() -> str:
b = BPF()
b.load_and_attach()
if b.is_loaded() and b.is_attached():
print("Successfully loaded and attached")
else:
print("Could not load successfully")
# Now cat /sys/kernel/debug/tracing/trace_pipe to see results of the execve syscall.
b.load()
b.attach_all()
trace_pipe()

View File

@ -1,4 +1,4 @@
from pythonbpf import bpf, section, bpfglobal, BPF
from pythonbpf import bpf, section, bpfglobal, BPF, trace_pipe
from ctypes import c_void_p, c_int64
@ -23,7 +23,7 @@ def LICENSE() -> str:
b = BPF()
b.load_and_attach()
while True:
print("running")
# Now cat /sys/kernel/debug/tracing/trace_pipe to see results of unlink kprobe.
b.load()
b.attach_all()
print("running")
trace_pipe()

View File

@ -23,14 +23,14 @@ def count() -> HashMap:
def hello_world(ctx: c_void_p) -> c_int64:
key = 0
one = 1
prev = count().lookup(key)
prev = count.lookup(key)
if prev:
prevval = prev + 1
print(f"count: {prevval}")
count().update(key, prevval)
count.update(key, prevval)
return XDP_PASS
else:
count().update(key, one)
count.update(key, one)
return XDP_PASS

View File

@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "pythonbpf"
version = "0.1.6"
version = "0.1.8"
description = "Reduced Python frontend for eBPF"
authors = [
{ name = "r41k0u", email="pragyanshchaturvedi18@gmail.com" },
@ -26,10 +26,10 @@ classifiers = [
]
readme = "README.md"
license = {text = "Apache-2.0"}
requires-python = ">=3.8"
requires-python = ">=3.10"
dependencies = [
"llvmlite",
"llvmlite>=0.45",
"astpretty",
"pylibbpf"
]

View File

@ -1,28 +1,17 @@
import ast
import logging
import ctypes
from llvmlite import ir
from dataclasses import dataclass
from typing import Any
from .local_symbol import LocalSymbol
from pythonbpf.helper import HelperHandlerRegistry
from pythonbpf.vmlinux_parser.dependency_node import Field
from .expr import VmlinuxHandlerRegistry
from pythonbpf.type_deducer import ctypes_to_ir
from pythonbpf.maps import BPFMapType
logger = logging.getLogger(__name__)
@dataclass
class LocalSymbol:
var: ir.AllocaInstr
ir_type: ir.Type
metadata: Any = None
def __iter__(self):
yield self.var
yield self.ir_type
yield self.metadata
def create_targets_and_rvals(stmt):
"""Create lists of targets and right-hand values from an assignment statement."""
if isinstance(stmt.targets[0], ast.Tuple):
@ -37,7 +26,9 @@ def create_targets_and_rvals(stmt):
return stmt.targets, [stmt.value]
def handle_assign_allocation(builder, stmt, local_sym_tab, structs_sym_tab):
def handle_assign_allocation(
builder, stmt, local_sym_tab, map_sym_tab, structs_sym_tab
):
"""Handle memory allocation for assignment statements."""
logger.info(f"Handling assignment for allocation: {ast.dump(stmt)}")
@ -60,24 +51,16 @@ def handle_assign_allocation(builder, stmt, local_sym_tab, structs_sym_tab):
continue
var_name = target.id
# Skip if already allocated
if var_name in local_sym_tab:
logger.debug(f"Variable {var_name} already allocated, skipping")
continue
# When allocating a variable, check if it's a vmlinux struct type
if isinstance(
stmt.value, ast.Name
) and VmlinuxHandlerRegistry.is_vmlinux_struct(stmt.value.id):
# Handle vmlinux struct allocation
# This requires more implementation
print(stmt.value)
pass
# Determine type and allocate based on rval
if isinstance(rval, ast.Call):
_allocate_for_call(builder, var_name, rval, local_sym_tab, structs_sym_tab)
_allocate_for_call(
builder, var_name, rval, local_sym_tab, map_sym_tab, structs_sym_tab
)
elif isinstance(rval, ast.Constant):
_allocate_for_constant(builder, var_name, rval, local_sym_tab)
elif isinstance(rval, ast.BinOp):
@ -96,14 +79,16 @@ def handle_assign_allocation(builder, stmt, local_sym_tab, structs_sym_tab):
)
def _allocate_for_call(builder, var_name, rval, local_sym_tab, structs_sym_tab):
def _allocate_for_call(
builder, var_name, rval, local_sym_tab, map_sym_tab, structs_sym_tab
):
"""Allocate memory for variable assigned from a call."""
if isinstance(rval.func, ast.Name):
call_type = rval.func.id
# C type constructors
if call_type in ("c_int32", "c_int64", "c_uint32", "c_uint64"):
if call_type in ("c_int32", "c_int64", "c_uint32", "c_uint64", "c_void_p"):
ir_type = ctypes_to_ir(call_type)
var = builder.alloca(ir_type, name=var_name)
var.align = ir_type.width // 8
@ -129,24 +114,108 @@ def _allocate_for_call(builder, var_name, rval, local_sym_tab, structs_sym_tab):
# Struct constructors
elif call_type in structs_sym_tab:
struct_info = structs_sym_tab[call_type]
var = builder.alloca(struct_info.ir_type, name=var_name)
local_sym_tab[var_name] = LocalSymbol(var, struct_info.ir_type, call_type)
logger.info(f"Pre-allocated {var_name} for struct {call_type}")
if len(rval.args) == 0:
# Zero-arg constructor: allocate the struct itself
var = builder.alloca(struct_info.ir_type, name=var_name)
local_sym_tab[var_name] = LocalSymbol(
var, struct_info.ir_type, call_type
)
logger.info(f"Pre-allocated {var_name} for struct {call_type}")
else:
# Pointer cast: allocate as pointer to struct
ptr_type = ir.PointerType(struct_info.ir_type)
var = builder.alloca(ptr_type, name=var_name)
var.align = 8
local_sym_tab[var_name] = LocalSymbol(var, ptr_type, call_type)
logger.info(
f"Pre-allocated {var_name} for struct pointer cast to {call_type}"
)
elif VmlinuxHandlerRegistry.is_vmlinux_struct(call_type):
# When calling struct_name(pointer), we're doing a cast, not construction
# So we allocate as a pointer (i64) not as the actual struct
var = builder.alloca(ir.IntType(64), name=var_name)
var.align = 8
local_sym_tab[var_name] = LocalSymbol(
var, ir.IntType(64), VmlinuxHandlerRegistry.get_struct_type(call_type)
)
logger.info(
f"Pre-allocated {var_name} for vmlinux struct pointer cast to {call_type}"
)
else:
logger.warning(f"Unknown call type for allocation: {call_type}")
elif isinstance(rval.func, ast.Attribute):
# Map method calls - need double allocation for ptr handling
_allocate_for_map_method(builder, var_name, local_sym_tab)
_allocate_for_map_method(
builder, var_name, rval, local_sym_tab, map_sym_tab, structs_sym_tab
)
else:
logger.warning(f"Unsupported call function type for {var_name}")
def _allocate_for_map_method(builder, var_name, local_sym_tab):
def _allocate_for_map_method(
builder, var_name, rval, local_sym_tab, map_sym_tab, structs_sym_tab
):
"""Allocate memory for variable assigned from map method (double alloc)."""
map_name = rval.func.value.id
method_name = rval.func.attr
# NOTE: We will have to special case HashMap.lookup which returns a pointer to value type
# The value type can be a struct as well, so we need to handle that properly
# This special casing is not ideal, as over time other map methods may need similar handling
# But for now, we will just handle lookup specifically
if map_name not in map_sym_tab:
logger.error(f"Map '{map_name}' not found for allocation")
return
if method_name != "lookup":
# Fallback allocation for other map methods
_allocate_for_map_method_fallback(builder, var_name, local_sym_tab)
return
map_params = map_sym_tab[map_name].params
if map_params["type"] != BPFMapType.HASH:
logger.warning(
"Map method lookup used on non-hash map, using fallback allocation"
)
_allocate_for_map_method_fallback(builder, var_name, local_sym_tab)
return
value_type = map_params["value"]
# Determine IR type for value
if isinstance(value_type, str) and value_type in structs_sym_tab:
struct_info = structs_sym_tab[value_type]
value_ir_type = struct_info.ir_type
else:
value_ir_type = ctypes_to_ir(value_type)
if value_ir_type is None:
logger.warning(
f"Could not determine IR type for map value '{value_type}', using fallback allocation"
)
_allocate_for_map_method_fallback(builder, var_name, local_sym_tab)
return
# Main variable (pointer to pointer)
ir_type = ir.PointerType(ir.IntType(64))
var = builder.alloca(ir_type, name=var_name)
local_sym_tab[var_name] = LocalSymbol(var, ir_type, value_type)
# Temporary variable for computed values
tmp_ir_type = value_ir_type
var_tmp = builder.alloca(tmp_ir_type, name=f"{var_name}_tmp")
local_sym_tab[f"{var_name}_tmp"] = LocalSymbol(var_tmp, tmp_ir_type)
logger.info(
f"Pre-allocated {var_name} and {var_name}_tmp for map method lookup of type {value_ir_type}"
)
def _allocate_for_map_method_fallback(builder, var_name, local_sym_tab):
"""Fallback allocation for map method variable (i64* and i64**)."""
# Main variable (pointer to pointer)
ir_type = ir.PointerType(ir.IntType(64))
var = builder.alloca(ir_type, name=var_name)
@ -157,7 +226,9 @@ def _allocate_for_map_method(builder, var_name, local_sym_tab):
var_tmp = builder.alloca(tmp_ir_type, name=f"{var_name}_tmp")
local_sym_tab[f"{var_name}_tmp"] = LocalSymbol(var_tmp, tmp_ir_type)
logger.info(f"Pre-allocated {var_name} and {var_name}_tmp for map method")
logger.info(
f"Pre-allocated {var_name} and {var_name}_tmp for map method (fallback)"
)
def _allocate_for_constant(builder, var_name, rval, local_sym_tab):
@ -199,17 +270,33 @@ def _allocate_for_binop(builder, var_name, local_sym_tab):
logger.info(f"Pre-allocated {var_name} for binop result")
def _get_type_name(ir_type):
"""Get a string representation of an IR type."""
if isinstance(ir_type, ir.IntType):
return f"i{ir_type.width}"
elif isinstance(ir_type, ir.PointerType):
return "ptr"
elif isinstance(ir_type, ir.ArrayType):
return f"[{ir_type.count}x{_get_type_name(ir_type.element)}]"
else:
return str(ir_type).replace(" ", "")
def allocate_temp_pool(builder, max_temps, local_sym_tab):
"""Allocate the temporary scratch space pool for helper arguments."""
if max_temps == 0:
if not max_temps:
logger.info("No temp pool allocation needed")
return
logger.info(f"Allocating temp pool of {max_temps} variables")
for i in range(max_temps):
temp_name = f"__helper_temp_{i}"
temp_var = builder.alloca(ir.IntType(64), name=temp_name)
temp_var.align = 8
local_sym_tab[temp_name] = LocalSymbol(temp_var, ir.IntType(64))
for tmp_type, cnt in max_temps.items():
type_name = _get_type_name(tmp_type)
logger.info(f"Allocating temp pool of {cnt} variables of type {type_name}")
for i in range(cnt):
temp_name = f"__helper_temp_{type_name}_{i}"
temp_var = builder.alloca(tmp_type, name=temp_name)
temp_var.align = _get_alignment(tmp_type)
local_sym_tab[temp_name] = LocalSymbol(temp_var, tmp_type)
logger.debug(f"Allocated temp variable: {temp_name}")
def _allocate_for_name(builder, var_name, rval, local_sym_tab):
@ -248,9 +335,99 @@ def _allocate_for_attribute(builder, var_name, rval, local_sym_tab, structs_sym_
logger.error(f"Struct variable '{struct_var}' not found")
return
struct_type = local_sym_tab[struct_var].metadata
struct_type: type = local_sym_tab[struct_var].metadata
if not struct_type or struct_type not in structs_sym_tab:
logger.error(f"Struct type '{struct_type}' not found")
if VmlinuxHandlerRegistry.is_vmlinux_struct(struct_type.__name__):
# Handle vmlinux struct field access
vmlinux_struct_name = struct_type.__name__
if not VmlinuxHandlerRegistry.has_field(vmlinux_struct_name, field_name):
logger.error(
f"Field '{field_name}' not found in vmlinux struct '{vmlinux_struct_name}'"
)
return
field_type: tuple[ir.GlobalVariable, Field] = (
VmlinuxHandlerRegistry.get_field_type(vmlinux_struct_name, field_name)
)
field_ir, field = field_type
# Determine the actual IR type based on the field's type
actual_ir_type = None
# Check if it's a ctypes primitive
if field.type.__module__ == ctypes.__name__:
try:
field_size_bytes = ctypes.sizeof(field.type)
field_size_bits = field_size_bytes * 8
if field_size_bits in [8, 16, 32, 64]:
# Special case: struct_xdp_md i32 fields should allocate as i64
# because load_ctx_field will zero-extend them to i64
if (
vmlinux_struct_name == "struct_xdp_md"
and field_size_bits == 32
):
actual_ir_type = ir.IntType(64)
logger.info(
f"Allocating {var_name} as i64 for i32 field from struct_xdp_md.{field_name} "
"(will be zero-extended during load)"
)
else:
actual_ir_type = ir.IntType(field_size_bits)
else:
logger.warning(
f"Unusual field size {field_size_bits} bits for {field_name}"
)
actual_ir_type = ir.IntType(64)
except Exception as e:
logger.warning(
f"Could not determine size for ctypes field {field_name}: {e}"
)
actual_ir_type = ir.IntType(64)
field_size_bits = 64
# Check if it's a nested vmlinux struct or complex type
elif field.type.__module__ == "vmlinux":
# For pointers to structs, use pointer type (64-bit)
if field.ctype_complex_type is not None and issubclass(
field.ctype_complex_type, ctypes._Pointer
):
actual_ir_type = ir.IntType(64) # Pointer is always 64-bit
field_size_bits = 64
# For embedded structs, this is more complex - might need different handling
else:
logger.warning(
f"Field {field_name} is a nested vmlinux struct, using i64 for now"
)
actual_ir_type = ir.IntType(64)
field_size_bits = 64
else:
logger.warning(
f"Unknown field type module {field.type.__module__} for {field_name}"
)
actual_ir_type = ir.IntType(64)
field_size_bits = 64
# Pre-allocate the tmp storage used by load_struct_field (so we don't alloca inside handler)
tmp_name = f"{struct_var}_{field_name}_tmp"
tmp_ir_type = ir.IntType(field_size_bits)
tmp_var = builder.alloca(tmp_ir_type, name=tmp_name)
tmp_var.align = tmp_ir_type.width // 8
local_sym_tab[tmp_name] = LocalSymbol(tmp_var, tmp_ir_type)
logger.info(
f"Pre-allocated temp {tmp_name} (i{field_size_bits}) for vmlinux field read {vmlinux_struct_name}.{field_name}"
)
# Allocate with the actual IR type for the destination var
var = _allocate_with_type(builder, var_name, actual_ir_type)
local_sym_tab[var_name] = LocalSymbol(var, actual_ir_type, field)
logger.info(
f"Pre-allocated {var_name} as {actual_ir_type} from vmlinux struct {vmlinux_struct_name}.{field_name}"
)
return
else:
logger.error(f"Struct type '{struct_type}' not found")
return
struct_info = structs_sym_tab[struct_type]

View File

@ -1,8 +1,12 @@
import ast
import logging
from inspect import isclass
from llvmlite import ir
from pythonbpf.expr import eval_expr
from pythonbpf.helper import emit_probe_read_kernel_str_call
from pythonbpf.type_deducer import ctypes_to_ir
from pythonbpf.vmlinux_parser.dependency_node import Field
logger = logging.getLogger(__name__)
@ -146,9 +150,74 @@ def handle_variable_assignment(
return False
val, val_type = val_result
logger.info(f"Evaluated value for {var_name}: {val} of type {val_type}, {var_type}")
logger.info(
f"Evaluated value for {var_name}: {val} of type {val_type}, expected {var_type}"
)
if val_type != var_type:
if isinstance(val_type, ir.IntType) and isinstance(var_type, ir.IntType):
# Handle vmlinux struct pointers - they're represented as Python classes but are i64 pointers
if isclass(val_type) and (val_type.__module__ == "vmlinux"):
logger.info("Handling vmlinux struct pointer assignment")
# vmlinux struct pointers: val is a pointer, need to convert to i64
if isinstance(var_type, ir.IntType) and var_type.width == 64:
# Convert pointer to i64 using ptrtoint
if isinstance(val.type, ir.PointerType):
val = builder.ptrtoint(val, ir.IntType(64))
logger.info(
"Converted vmlinux struct pointer to i64 using ptrtoint"
)
builder.store(val, var_ptr)
logger.info(f"Assigned vmlinux struct pointer to {var_name} (i64)")
return True
else:
logger.error(
f"Type mismatch: vmlinux struct pointer requires i64, got {var_type}"
)
return False
# Handle user-defined struct pointer casts
# val_type is a string (struct name), var_type is a pointer to the struct
if isinstance(val_type, str) and val_type in structs_sym_tab:
struct_info = structs_sym_tab[val_type]
expected_ptr_type = ir.PointerType(struct_info.ir_type)
# Check if var_type matches the expected pointer type
if isinstance(var_type, ir.PointerType) and var_type == expected_ptr_type:
# val is already the correct pointer type from inttoptr/bitcast
builder.store(val, var_ptr)
logger.info(f"Assigned user-defined struct pointer cast to {var_name}")
return True
else:
logger.error(
f"Type mismatch: user-defined struct pointer cast requires pointer type, got {var_type}"
)
return False
if isinstance(val_type, Field):
logger.info("Handling assignment to struct field")
# Special handling for struct_xdp_md i32 fields that are zero-extended to i64
# The load_ctx_field already extended them, so val is i64 but val_type.type shows c_uint
if (
hasattr(val_type, "type")
and val_type.type.__name__ == "c_uint"
and isinstance(var_type, ir.IntType)
and var_type.width == 64
):
# This is the struct_xdp_md case - value is already i64
builder.store(val, var_ptr)
logger.info(
f"Assigned zero-extended struct_xdp_md i32 field to {var_name} (i64)"
)
return True
# TODO: handling only ctype struct fields for now. Handle other stuff too later.
elif var_type == ctypes_to_ir(val_type.type.__name__):
builder.store(val, var_ptr)
logger.info(f"Assigned ctype struct field to {var_name}")
return True
else:
logger.error(
f"Failed to assign ctype struct field to {var_name}: {val_type} != {var_type}"
)
return False
elif isinstance(val_type, ir.IntType) and isinstance(var_type, ir.IntType):
# Allow implicit int widening
if val_type.width < var_type.width:
val = builder.sext(val, var_type)

View File

@ -25,7 +25,7 @@ import re
logger: Logger = logging.getLogger(__name__)
VERSION = "v0.1.6"
VERSION = "v0.1.8"
def finalize_module(original_str):
@ -37,6 +37,24 @@ def finalize_module(original_str):
return re.sub(pattern, replacement, original_str)
def bpf_passthrough_gen(module):
i32_ty = ir.IntType(32)
ptr_ty = ir.PointerType(ir.IntType(8))
fnty = ir.FunctionType(ptr_ty, [i32_ty, ptr_ty])
# Declare the intrinsic
passthrough = ir.Function(module, fnty, "llvm.bpf.passthrough.p0.p0")
# Set function attributes
# TODO: the ones commented are supposed to be there but cannot be added due to llvmlite limitations at the moment
# passthrough.attributes.add("nofree")
# passthrough.attributes.add("nosync")
passthrough.attributes.add("nounwind")
# passthrough.attributes.add("memory(none)")
return passthrough
def find_bpf_chunks(tree):
"""Find all functions decorated with @bpf in the AST."""
bpf_functions = []
@ -57,6 +75,8 @@ def processor(source_code, filename, module):
for func_node in bpf_chunks:
logger.info(f"Found BPF function/struct: {func_node.name}")
bpf_passthrough_gen(module)
vmlinux_symtab = vmlinux_proc(tree, module)
if vmlinux_symtab:
handler = VmlinuxHandler.initialize(vmlinux_symtab)
@ -66,7 +86,7 @@ def processor(source_code, filename, module):
license_processing(tree, module)
globals_processing(tree, module)
structs_sym_tab = structs_proc(tree, module, bpf_chunks)
map_sym_tab = maps_proc(tree, module, bpf_chunks)
map_sym_tab = maps_proc(tree, module, bpf_chunks, structs_sym_tab)
func_proc(tree, module, bpf_chunks, map_sym_tab, structs_sym_tab)
globals_list_creation(tree, module)
@ -137,7 +157,7 @@ def compile_to_ir(filename: str, output: str, loglevel=logging.INFO):
module.add_named_metadata("llvm.ident", [f"PythonBPF {VERSION}"])
module_string = finalize_module(str(module))
module_string: str = finalize_module(str(module))
logger.info(f"IR written to {output}")
with open(output, "w") as f:
@ -198,13 +218,11 @@ def compile(loglevel=logging.WARNING) -> bool:
def BPF(loglevel=logging.WARNING) -> BpfObject:
caller_frame = inspect.stack()[1]
src = inspect.getsource(caller_frame.frame)
with tempfile.NamedTemporaryFile(
mode="w+", delete=True, suffix=".py"
) as f, tempfile.NamedTemporaryFile(
mode="w+", delete=True, suffix=".ll"
) as inter, tempfile.NamedTemporaryFile(
mode="w+", delete=False, suffix=".o"
) as obj_file:
with (
tempfile.NamedTemporaryFile(mode="w+", delete=True, suffix=".py") as f,
tempfile.NamedTemporaryFile(mode="w+", delete=True, suffix=".ll") as inter,
tempfile.NamedTemporaryFile(mode="w+", delete=False, suffix=".o") as obj_file,
):
f.write(src)
f.flush()
source = f.name

View File

@ -49,6 +49,10 @@ class DebugInfoGenerator:
)
return self._type_cache[key]
def get_uint8_type(self) -> Any:
"""Get debug info for signed 8-bit integer"""
return self.get_basic_type("char", 8, dc.DW_ATE_unsigned)
def get_int32_type(self) -> Any:
"""Get debug info for signed 32-bit integer"""
return self.get_basic_type("int", 32, dc.DW_ATE_signed)
@ -184,3 +188,83 @@ class DebugInfoGenerator:
"DIGlobalVariableExpression",
{"var": global_var, "expr": self.module.add_debug_info("DIExpression", {})},
)
def get_int64_type(self):
return self.get_basic_type("long", 64, dc.DW_ATE_signed)
def create_subroutine_type(self, return_type, param_types):
"""
Create a DISubroutineType given return type and list of parameter types.
Equivalent to: !DISubroutineType(types: !{ret, args...})
"""
type_array = [return_type]
if isinstance(param_types, (list, tuple)):
type_array.extend(param_types)
else:
type_array.append(param_types)
return self.module.add_debug_info("DISubroutineType", {"types": type_array})
def create_local_variable_debug_info(
self, name: str, arg: int, var_type: Any
) -> Any:
"""
Create debug info for a local variable (DILocalVariable) without scope.
Example:
!DILocalVariable(name: "ctx", arg: 1, file: !3, line: 20, type: !7)
"""
return self.module.add_debug_info(
"DILocalVariable",
{
"name": name,
"arg": arg,
"file": self.module._file_metadata,
"type": var_type,
},
)
def add_scope_to_local_variable(self, local_variable_debug_info, scope_value):
"""
Add scope information to an existing local variable debug info object.
"""
# TODO: this is a workaround a flaw in the debug info generation. Fix this if possible in the future.
# We should not be touching llvmlite's internals like this.
if hasattr(local_variable_debug_info, "operands"):
# LLVM metadata operands is a tuple, so we need to rebuild it
existing_operands = local_variable_debug_info.operands
# Convert tuple to list, add scope, convert back to tuple
operands_list = list(existing_operands)
operands_list.append(("scope", scope_value))
# Reassign the new tuple
local_variable_debug_info.operands = tuple(operands_list)
def create_subprogram(
self, name: str, subroutine_type: Any, retained_nodes: List[Any]
) -> Any:
"""
Create a DISubprogram for a function.
Args:
name: Function name
subroutine_type: DISubroutineType for the function signature
retained_nodes: List of DILocalVariable nodes for function parameters/variables
Returns:
DISubprogram metadata
"""
return self.module.add_debug_info(
"DISubprogram",
{
"name": name,
"scope": self.module._file_metadata,
"file": self.module._file_metadata,
"type": subroutine_type,
# TODO: the following flags do not exist at the moment in our dwarf constants file. We need to add them.
# "flags": dc.DW_FLAG_Prototyped | dc.DW_FLAG_AllCallsDescribed,
# "spFlags": dc.DW_SPFLAG_Definition | dc.DW_SPFLAG_Optimized,
"unit": self.module._debug_compile_unit,
"retainedNodes": retained_nodes,
},
is_distinct=True,
)

View File

@ -1,6 +1,6 @@
from .expr_pass import eval_expr, handle_expr, get_operand_value
from .type_normalization import convert_to_bool, get_base_type_and_depth
from .ir_ops import deref_to_depth
from .ir_ops import deref_to_depth, access_struct_field
from .call_registry import CallHandlerRegistry
from .vmlinux_registry import VmlinuxHandlerRegistry
@ -10,6 +10,7 @@ __all__ = [
"convert_to_bool",
"get_base_type_and_depth",
"deref_to_depth",
"access_struct_field",
"get_operand_value",
"CallHandlerRegistry",
"VmlinuxHandlerRegistry",

View File

@ -6,13 +6,14 @@ from typing import Dict
from pythonbpf.type_deducer import ctypes_to_ir, is_ctypes
from .call_registry import CallHandlerRegistry
from .ir_ops import deref_to_depth, access_struct_field
from .type_normalization import (
convert_to_bool,
handle_comparator,
get_base_type_and_depth,
deref_to_depth,
)
from .vmlinux_registry import VmlinuxHandlerRegistry
from ..vmlinux_parser.dependency_node import Field
logger: Logger = logging.getLogger(__name__)
@ -60,6 +61,7 @@ def _handle_constant_expr(module, builder, expr: ast.Constant):
def _handle_attribute_expr(
func,
expr: ast.Attribute,
local_sym_tab: Dict,
structs_sym_tab: Dict,
@ -72,20 +74,46 @@ def _handle_attribute_expr(
if var_name in local_sym_tab:
var_ptr, var_type, var_metadata = local_sym_tab[var_name]
logger.info(f"Loading attribute {attr_name} from variable {var_name}")
logger.info(f"Variable type: {var_type}, Variable ptr: {var_ptr}")
metadata = structs_sym_tab[var_metadata]
if attr_name in metadata.fields:
gep = metadata.gep(builder, var_ptr, attr_name)
val = builder.load(gep)
field_type = metadata.field_type(attr_name)
return val, field_type
logger.info(
f"Variable type: {var_type}, Variable ptr: {var_ptr}, Variable Metadata: {var_metadata}"
)
if (
hasattr(var_metadata, "__module__")
and var_metadata.__module__ == "vmlinux"
):
# Try vmlinux handler when var_metadata is not a string, but has a module attribute.
# This has been done to keep everything separate in vmlinux struct handling.
vmlinux_result = VmlinuxHandlerRegistry.handle_attribute(
expr, local_sym_tab, None, builder
)
if vmlinux_result is not None:
return vmlinux_result
else:
raise RuntimeError("Vmlinux struct did not process successfully")
elif isinstance(var_metadata, Field):
logger.error(
f"Cannot access field '{attr_name}' on already-loaded field value '{var_name}'"
)
return None
if var_metadata in structs_sym_tab:
return access_struct_field(
builder,
var_ptr,
var_type,
var_metadata,
expr.attr,
structs_sym_tab,
func,
)
else:
logger.error(f"Struct metadata for '{var_name}' not found")
else:
logger.error(f"Undefined variable '{var_name}' for attribute access")
else:
logger.error("Unsupported attribute base expression type")
# Try vmlinux handler as fallback
vmlinux_result = VmlinuxHandlerRegistry.handle_attribute(
expr, local_sym_tab, None, builder
)
if vmlinux_result is not None:
return vmlinux_result
return None
@ -140,7 +168,11 @@ def get_operand_value(
var_type = var.type
base_type, depth = get_base_type_and_depth(var_type)
logger.info(f"var is {var}, base_type is {base_type}, depth is {depth}")
val = deref_to_depth(func, builder, var, depth)
if depth == 1:
val = builder.load(var)
return val
else:
val = deref_to_depth(func, builder, var, depth)
return val
else:
# Check if it's a vmlinux enum/constant
@ -271,16 +303,45 @@ def _handle_ctypes_call(
call_type = expr.func.id
expected_type = ctypes_to_ir(call_type)
if val[1] != expected_type:
# Extract the actual IR value and type
# val could be (value, ir_type) or (value, Field)
value, val_type = val
# If val_type is a Field object (from vmlinux struct), get the actual IR type of the value
if isinstance(val_type, Field):
# The value is already the correct IR value (potentially zero-extended)
# Get the IR type from the value itself
actual_ir_type = value.type
logger.info(
f"Converting vmlinux field {val_type.name} (IR type: {actual_ir_type}) to {call_type}"
)
else:
actual_ir_type = val_type
if actual_ir_type != expected_type:
# NOTE: We are only considering casting to and from int types for now
if isinstance(val[1], ir.IntType) and isinstance(expected_type, ir.IntType):
if val[1].width < expected_type.width:
val = (builder.sext(val[0], expected_type), expected_type)
if isinstance(actual_ir_type, ir.IntType) and isinstance(
expected_type, ir.IntType
):
if actual_ir_type.width < expected_type.width:
value = builder.sext(value, expected_type)
logger.info(
f"Sign-extended from i{actual_ir_type.width} to i{expected_type.width}"
)
elif actual_ir_type.width > expected_type.width:
value = builder.trunc(value, expected_type)
logger.info(
f"Truncated from i{actual_ir_type.width} to i{expected_type.width}"
)
else:
val = (builder.trunc(val[0], expected_type), expected_type)
# Same width, just use as-is (e.g., both i64)
pass
else:
raise ValueError(f"Type mismatch: expected {expected_type}, got {val[1]}")
return val
raise ValueError(
f"Type mismatch: expected {expected_type}, got {actual_ir_type} (original type: {val_type})"
)
return value, expected_type
def _handle_compare(
@ -487,6 +548,134 @@ def _handle_boolean_op(
return None
# ============================================================================
# Struct casting (including vmlinux struct casting)
# ============================================================================
def _handle_vmlinux_cast(
func,
module,
builder,
expr,
local_sym_tab,
map_sym_tab,
structs_sym_tab=None,
):
# handle expressions such as struct_request(ctx.di) where struct_request is a vmlinux
# struct and ctx.di is a pointer to a struct but is actually represented as a c_uint64
# which needs to be cast to a pointer. This is also a field of another vmlinux struct
"""Handle vmlinux struct cast expressions like struct_request(ctx.di)."""
if len(expr.args) != 1:
logger.info("vmlinux struct cast takes exactly one argument")
return None
# Get the struct name
struct_name = expr.func.id
# Evaluate the argument (e.g., ctx.di which is a c_uint64)
arg_result = eval_expr(
func,
module,
builder,
expr.args[0],
local_sym_tab,
map_sym_tab,
structs_sym_tab,
)
if arg_result is None:
logger.info("Failed to evaluate argument to vmlinux struct cast")
return None
arg_val, arg_type = arg_result
# Get the vmlinux struct type
vmlinux_struct_type = VmlinuxHandlerRegistry.get_struct_type(struct_name)
if vmlinux_struct_type is None:
logger.error(f"Failed to get vmlinux struct type for {struct_name}")
return None
# Cast the integer/value to a pointer to the struct
# If arg_val is an integer type, we need to inttoptr it
ptr_type = ir.PointerType()
# TODO: add a field value type check here
# print(arg_type)
if isinstance(arg_type, Field):
if ctypes_to_ir(arg_type.type.__name__):
# Cast integer to pointer
casted_ptr = builder.inttoptr(arg_val, ptr_type)
else:
logger.error(f"Unsupported type for vmlinux cast: {arg_type}")
return None
else:
casted_ptr = builder.inttoptr(arg_val, ptr_type)
return casted_ptr, vmlinux_struct_type
def _handle_user_defined_struct_cast(
func,
module,
builder,
expr,
local_sym_tab,
map_sym_tab,
structs_sym_tab,
):
"""Handle user-defined struct cast expressions like iphdr(nh).
This casts a pointer/integer value to a pointer to the user-defined struct,
similar to how vmlinux struct casts work but for user-defined @struct types.
"""
if len(expr.args) != 1:
logger.info("User-defined struct cast takes exactly one argument")
return None
# Get the struct name
struct_name = expr.func.id
if struct_name not in structs_sym_tab:
logger.error(f"Struct {struct_name} not found in structs_sym_tab")
return None
struct_info = structs_sym_tab[struct_name]
# Evaluate the argument (e.g.,
# an address/pointer value)
arg_result = eval_expr(
func,
module,
builder,
expr.args[0],
local_sym_tab,
map_sym_tab,
structs_sym_tab,
)
if arg_result is None:
logger.info("Failed to evaluate argument to user-defined struct cast")
return None
arg_val, arg_type = arg_result
# Cast the integer/pointer value to a pointer to the struct type
# The struct pointer type is a pointer to the struct's IR type
struct_ptr_type = ir.PointerType(struct_info.ir_type)
# If arg_val is an integer type (like i64), convert to pointer using inttoptr
if isinstance(arg_val.type, ir.IntType):
casted_ptr = builder.inttoptr(arg_val, struct_ptr_type)
logger.info(f"Cast integer to pointer for struct {struct_name}")
elif isinstance(arg_val.type, ir.PointerType):
# If already a pointer, bitcast to the struct pointer type
casted_ptr = builder.bitcast(arg_val, struct_ptr_type)
logger.info(f"Bitcast pointer to struct pointer for {struct_name}")
else:
logger.error(f"Unsupported type for user-defined struct cast: {arg_val.type}")
return None
return casted_ptr, struct_name
# ============================================================================
# Expression Dispatcher
# ============================================================================
@ -507,6 +696,18 @@ def eval_expr(
elif isinstance(expr, ast.Constant):
return _handle_constant_expr(module, builder, expr)
elif isinstance(expr, ast.Call):
if isinstance(expr.func, ast.Name) and VmlinuxHandlerRegistry.is_vmlinux_struct(
expr.func.id
):
return _handle_vmlinux_cast(
func,
module,
builder,
expr,
local_sym_tab,
map_sym_tab,
structs_sym_tab,
)
if isinstance(expr.func, ast.Name) and expr.func.id == "deref":
return _handle_deref_call(expr, local_sym_tab, builder)
@ -520,6 +721,16 @@ def eval_expr(
map_sym_tab,
structs_sym_tab,
)
if isinstance(expr.func, ast.Name) and (expr.func.id in structs_sym_tab):
return _handle_user_defined_struct_cast(
func,
module,
builder,
expr,
local_sym_tab,
map_sym_tab,
structs_sym_tab,
)
result = CallHandlerRegistry.handle_call(
expr, module, builder, func, local_sym_tab, map_sym_tab, structs_sym_tab
@ -530,7 +741,9 @@ def eval_expr(
logger.warning(f"Unknown call: {ast.dump(expr)}")
return None
elif isinstance(expr, ast.Attribute):
return _handle_attribute_expr(expr, local_sym_tab, structs_sym_tab, builder)
return _handle_attribute_expr(
func, expr, local_sym_tab, structs_sym_tab, builder
)
elif isinstance(expr, ast.BinOp):
return _handle_binary_op(
func,

View File

@ -17,34 +17,100 @@ def deref_to_depth(func, builder, val, target_depth):
# dereference with null check
pointee_type = cur_type.pointee
null_check_block = builder.block
not_null_block = func.append_basic_block(name=f"deref_not_null_{depth}")
merge_block = func.append_basic_block(name=f"deref_merge_{depth}")
null_ptr = ir.Constant(cur_type, None)
is_not_null = builder.icmp_signed("!=", cur_val, null_ptr)
logger.debug(f"Inserted null check for pointer at depth {depth}")
def load_op(builder, ptr):
return builder.load(ptr)
builder.cbranch(is_not_null, not_null_block, merge_block)
builder.position_at_end(not_null_block)
dereferenced_val = builder.load(cur_val)
logger.debug(f"Dereferenced to depth {depth - 1}, type: {pointee_type}")
builder.branch(merge_block)
builder.position_at_end(merge_block)
phi = builder.phi(pointee_type, name=f"deref_result_{depth}")
zero_value = (
ir.Constant(pointee_type, 0)
if isinstance(pointee_type, ir.IntType)
else ir.Constant(pointee_type, None)
cur_val = _null_checked_operation(
func, builder, cur_val, load_op, pointee_type, f"deref_{depth}"
)
phi.add_incoming(zero_value, null_check_block)
phi.add_incoming(dereferenced_val, not_null_block)
# Continue with phi result
cur_val = phi
cur_type = pointee_type
logger.debug(f"Dereferenced to depth {depth}, type: {pointee_type}")
return cur_val
def _null_checked_operation(func, builder, ptr, operation, result_type, name_prefix):
"""
Generic null-checked operation on a pointer.
"""
curr_block = builder.block
not_null_block = func.append_basic_block(name=f"{name_prefix}_not_null")
merge_block = func.append_basic_block(name=f"{name_prefix}_merge")
null_ptr = ir.Constant(ptr.type, None)
is_not_null = builder.icmp_signed("!=", ptr, null_ptr)
builder.cbranch(is_not_null, not_null_block, merge_block)
builder.position_at_end(not_null_block)
result = operation(builder, ptr)
not_null_after = builder.block
builder.branch(merge_block)
builder.position_at_end(merge_block)
phi = builder.phi(result_type, name=f"{name_prefix}_result")
if isinstance(result_type, ir.IntType):
null_val = ir.Constant(result_type, 0)
elif isinstance(result_type, ir.PointerType):
null_val = ir.Constant(result_type, None)
else:
null_val = ir.Constant(result_type, ir.Undefined)
phi.add_incoming(null_val, curr_block)
phi.add_incoming(result, not_null_after)
return phi
def access_struct_field(
builder, var_ptr, var_type, var_metadata, field_name, structs_sym_tab, func=None
):
"""
Access a struct field - automatically returns value or pointer based on field type.
"""
metadata = (
structs_sym_tab.get(var_metadata)
if isinstance(var_metadata, str)
else var_metadata
)
if not metadata or field_name not in metadata.fields:
raise ValueError(f"Field '{field_name}' not found in struct")
field_type = metadata.field_type(field_name)
is_ptr_to_struct = isinstance(var_type, ir.PointerType) and isinstance(
var_metadata, str
)
# Get struct pointer
struct_ptr = builder.load(var_ptr) if is_ptr_to_struct else var_ptr
should_load = not isinstance(field_type, ir.ArrayType)
def field_access_op(builder, ptr):
typed_ptr = builder.bitcast(ptr, metadata.ir_type.as_pointer())
field_ptr = metadata.gep(builder, typed_ptr, field_name)
return builder.load(field_ptr) if should_load else field_ptr
# Handle null check for pointer-to-struct
if is_ptr_to_struct:
if func is None:
raise ValueError("func required for null-safe struct pointer access")
if should_load:
result_type = field_type
else:
result_type = field_type.as_pointer()
result = _null_checked_operation(
func,
builder,
struct_ptr,
field_access_op,
result_type,
f"field_{field_name}",
)
return result, field_type
field_ptr = metadata.gep(builder, struct_ptr, field_name)
result = builder.load(field_ptr) if should_load else field_ptr
return result, field_type

View File

@ -1,5 +1,7 @@
import ast
from pythonbpf.vmlinux_parser.vmlinux_exports_handler import VmlinuxHandler
class VmlinuxHandlerRegistry:
"""Registry for vmlinux handler operations"""
@ -7,7 +9,7 @@ class VmlinuxHandlerRegistry:
_handler = None
@classmethod
def set_handler(cls, handler):
def set_handler(cls, handler: VmlinuxHandler):
"""Set the vmlinux handler"""
cls._handler = handler
@ -37,9 +39,37 @@ class VmlinuxHandlerRegistry:
)
return None
@classmethod
def get_struct_debug_info(cls, name):
if cls._handler is None:
return False
return cls._handler.get_struct_debug_info(name)
@classmethod
def is_vmlinux_struct(cls, name):
"""Check if a name refers to a vmlinux struct"""
if cls._handler is None:
return False
return cls._handler.is_vmlinux_struct(name)
@classmethod
def get_struct_type(cls, name):
"""Try to handle a struct name as vmlinux struct"""
if cls._handler is None:
return None
return cls._handler.get_vmlinux_struct_type(name)
@classmethod
def has_field(cls, vmlinux_struct_name, field_name):
"""Check if a vmlinux struct has a specific field"""
if cls._handler is None:
return False
return cls._handler.has_field(vmlinux_struct_name, field_name)
@classmethod
def get_field_type(cls, vmlinux_struct_name, field_name):
"""Get the type of a field in a vmlinux struct"""
if cls._handler is None:
return None
assert isinstance(cls._handler, VmlinuxHandler)
return cls._handler.get_field_type(vmlinux_struct_name, field_name)

View File

@ -0,0 +1,82 @@
import ast
import llvmlite.ir as ir
import logging
from pythonbpf.debuginfo import DebugInfoGenerator
from pythonbpf.expr import VmlinuxHandlerRegistry
import ctypes
logger = logging.getLogger(__name__)
def generate_function_debug_info(
func_node: ast.FunctionDef, module: ir.Module, func: ir.Function
):
generator = DebugInfoGenerator(module)
leading_argument = func_node.args.args[0]
leading_argument_name = leading_argument.arg
annotation = leading_argument.annotation
if func_node.returns is None:
# TODO: should check if this logic is consistent with function return type handling elsewhere
return_type = ctypes.c_int64()
elif hasattr(func_node.returns, "id"):
return_type = func_node.returns.id
if return_type == "c_int32":
return_type = generator.get_int32_type()
elif return_type == "c_int64":
return_type = generator.get_int64_type()
elif return_type == "c_uint32":
return_type = generator.get_uint32_type()
elif return_type == "c_uint64":
return_type = generator.get_uint64_type()
else:
logger.warning(
"Return type should be int32, int64, uint32 or uint64 only. Falling back to int64"
)
return_type = generator.get_int64_type()
else:
return_type = ctypes.c_int64()
# context processing
if annotation is None:
logger.warning("Type of context of function not found.")
return
if hasattr(annotation, "id"):
ctype_name = annotation.id
if ctype_name == "c_void_p":
return
elif ctype_name.startswith("ctypes"):
raise SyntaxError(
"The first argument should always be a pointer to a struct or a void pointer"
)
context_debug_info = VmlinuxHandlerRegistry.get_struct_debug_info(annotation.id)
# Create pointer to context this must be created fresh for each function
# to avoid circular reference issues when the same struct is used in multiple functions
pointer_to_context_debug_info = generator.create_pointer_type(
context_debug_info, 64
)
# Create subroutine type - also fresh for each function
subroutine_type = generator.create_subroutine_type(
return_type, pointer_to_context_debug_info
)
# Create local variable - fresh for each function with unique name
context_local_variable = generator.create_local_variable_debug_info(
leading_argument_name, 1, pointer_to_context_debug_info
)
retained_nodes = [context_local_variable]
logger.info(f"Generating debug info for function {func_node.name}")
# Create subprogram with is_distinct=True to ensure each function gets unique debug info
subprogram_debug_info = generator.create_subprogram(
func_node.name, subroutine_type, retained_nodes
)
generator.add_scope_to_local_variable(
context_local_variable, subprogram_debug_info
)
func.set_metadata("dbg", subprogram_debug_info)
else:
logger.error(f"Invalid annotation type for argument '{leading_argument_name}'")

View File

@ -7,7 +7,12 @@ from pythonbpf.helper import (
reset_scratch_pool,
)
from pythonbpf.type_deducer import ctypes_to_ir
from pythonbpf.expr import eval_expr, handle_expr, convert_to_bool
from pythonbpf.expr import (
eval_expr,
handle_expr,
convert_to_bool,
VmlinuxHandlerRegistry,
)
from pythonbpf.assign_pass import (
handle_variable_assignment,
handle_struct_field_assignment,
@ -16,8 +21,9 @@ from pythonbpf.allocation_pass import (
handle_assign_allocation,
allocate_temp_pool,
create_targets_and_rvals,
LocalSymbol,
)
from .function_debug_info import generate_function_debug_info
from .return_utils import handle_none_return, handle_xdp_return, is_xdp_name
from .function_metadata import get_probe_string, is_global_function, infer_return_type
@ -33,7 +39,7 @@ logger = logging.getLogger(__name__)
def count_temps_in_call(call_node, local_sym_tab):
"""Count the number of temporary variables needed for a function call."""
count = 0
count = {}
is_helper = False
# NOTE: We exclude print calls for now
@ -43,21 +49,28 @@ def count_temps_in_call(call_node, local_sym_tab):
and call_node.func.id != "print"
):
is_helper = True
func_name = call_node.func.id
elif isinstance(call_node.func, ast.Attribute):
if HelperHandlerRegistry.has_handler(call_node.func.attr):
is_helper = True
func_name = call_node.func.attr
if not is_helper:
return 0
return {} # No temps needed
for arg in call_node.args:
for arg_idx in range(len(call_node.args)):
# NOTE: Count all non-name arguments
# For struct fields, if it is being passed as an argument,
# The struct object should already exist in the local_sym_tab
if not isinstance(arg, ast.Name) and not (
arg = call_node.args[arg_idx]
if isinstance(arg, ast.Name) or (
isinstance(arg, ast.Attribute) and arg.value.id in local_sym_tab
):
count += 1
continue
param_type = HelperHandlerRegistry.get_param_type(func_name, arg_idx)
if isinstance(param_type, ir.PointerType):
pointee_type = param_type.pointee
count[pointee_type] = count.get(pointee_type, 0) + 1
return count
@ -93,11 +106,15 @@ def handle_if_allocation(
def allocate_mem(
module, builder, body, func, ret_type, map_sym_tab, local_sym_tab, structs_sym_tab
):
max_temps_needed = 0
max_temps_needed = {}
def merge_type_counts(count_dict):
nonlocal max_temps_needed
for typ, cnt in count_dict.items():
max_temps_needed[typ] = max(max_temps_needed.get(typ, 0), cnt)
def update_max_temps_for_stmt(stmt):
nonlocal max_temps_needed
temps_needed = 0
if isinstance(stmt, ast.If):
for s in stmt.body:
@ -106,10 +123,13 @@ def allocate_mem(
update_max_temps_for_stmt(s)
return
stmt_temps = {}
for node in ast.walk(stmt):
if isinstance(node, ast.Call):
temps_needed += count_temps_in_call(node, local_sym_tab)
max_temps_needed = max(max_temps_needed, temps_needed)
call_temps = count_temps_in_call(node, local_sym_tab)
for typ, cnt in call_temps.items():
stmt_temps[typ] = stmt_temps.get(typ, 0) + cnt
merge_type_counts(stmt_temps)
for stmt in body:
update_max_temps_for_stmt(stmt)
@ -127,7 +147,9 @@ def allocate_mem(
structs_sym_tab,
)
elif isinstance(stmt, ast.Assign):
handle_assign_allocation(builder, stmt, local_sym_tab, structs_sym_tab)
handle_assign_allocation(
builder, stmt, local_sym_tab, map_sym_tab, structs_sym_tab
)
allocate_temp_pool(builder, max_temps_needed, local_sym_tab)
@ -324,6 +346,28 @@ def process_func_body(
local_sym_tab = {}
# Add the context parameter (first function argument) to the local symbol table
if func_node.args.args and len(func_node.args.args) > 0:
context_arg = func_node.args.args[0]
context_name = context_arg.arg
if hasattr(context_arg, "annotation") and context_arg.annotation:
if isinstance(context_arg.annotation, ast.Name):
context_type_name = context_arg.annotation.id
elif isinstance(context_arg.annotation, ast.Attribute):
context_type_name = context_arg.annotation.attr
else:
raise TypeError(
f"Unsupported annotation type: {ast.dump(context_arg.annotation)}"
)
if VmlinuxHandlerRegistry.is_vmlinux_struct(context_type_name):
resolved_type = VmlinuxHandlerRegistry.get_struct_type(
context_type_name
)
context_type = LocalSymbol(None, None, resolved_type)
local_sym_tab[context_name] = context_type
logger.info(f"Added argument '{context_name}' to local symbol table")
# pre-allocate dynamic variables
local_sym_tab = allocate_mem(
module,
@ -374,7 +418,7 @@ def process_bpf_chunk(func_node, module, return_type, map_sym_tab, structs_sym_t
func.linkage = "dso_local"
func.attributes.add("nounwind")
func.attributes.add("noinline")
func.attributes.add("optnone")
# func.attributes.add("optnone")
if func_node.args.args:
# Only look at the first argument for now
@ -412,7 +456,7 @@ def func_proc(tree, module, chunks, map_sym_tab, structs_sym_tab):
func_type = get_probe_string(func_node)
logger.info(f"Found probe_string of {func_node.name}: {func_type}")
process_bpf_chunk(
func = process_bpf_chunk(
func_node,
module,
ctypes_to_ir(infer_return_type(func_node)),
@ -420,6 +464,9 @@ def func_proc(tree, module, chunks, map_sym_tab, structs_sym_tab):
structs_sym_tab,
)
logger.info(f"Generating Debug Info for Function {func_node.name}")
generate_function_debug_info(func_node, module, func)
# TODO: WIP, for string assignment to fixed-size arrays
def assign_string_to_array(builder, target_array_ptr, source_string_ptr, array_length):

View File

@ -1,7 +1,26 @@
from .helper_registry import HelperHandlerRegistry
from .helper_utils import reset_scratch_pool
from .bpf_helper_handler import handle_helper_call, emit_probe_read_kernel_str_call
from .helpers import ktime, pid, deref, comm, probe_read_str, XDP_DROP, XDP_PASS
from .bpf_helper_handler import (
handle_helper_call,
emit_probe_read_kernel_str_call,
emit_probe_read_kernel_call,
)
from .helpers import (
ktime,
pid,
deref,
comm,
probe_read_str,
random,
probe_read,
smp_processor_id,
uid,
skb_store_bytes,
get_current_cgroup_id,
get_stack,
XDP_DROP,
XDP_PASS,
)
# Register the helper handler with expr module
@ -60,11 +79,19 @@ __all__ = [
"reset_scratch_pool",
"handle_helper_call",
"emit_probe_read_kernel_str_call",
"emit_probe_read_kernel_call",
"get_current_cgroup_id",
"ktime",
"pid",
"deref",
"comm",
"probe_read_str",
"random",
"probe_read",
"smp_processor_id",
"uid",
"skb_store_bytes",
"get_stack",
"XDP_DROP",
"XDP_PASS",
]

View File

@ -8,30 +8,45 @@ from .helper_utils import (
get_flags_val,
get_data_ptr_and_size,
get_buffer_ptr_and_size,
get_char_array_ptr_and_size,
get_ptr_from_arg,
get_int_value_from_arg,
)
from .printk_formatter import simple_string_print, handle_fstring_print
from logging import Logger
from pythonbpf.maps import BPFMapType
import logging
logger: Logger = logging.getLogger(__name__)
logger = logging.getLogger(__name__)
class BPFHelperID(Enum):
BPF_MAP_LOOKUP_ELEM = 1
BPF_MAP_UPDATE_ELEM = 2
BPF_MAP_DELETE_ELEM = 3
BPF_PROBE_READ = 4
BPF_KTIME_GET_NS = 5
BPF_PRINTK = 6
BPF_GET_PRANDOM_U32 = 7
BPF_GET_SMP_PROCESSOR_ID = 8
BPF_SKB_STORE_BYTES = 9
BPF_GET_CURRENT_PID_TGID = 14
BPF_GET_CURRENT_UID_GID = 15
BPF_GET_CURRENT_CGROUP_ID = 80
BPF_GET_CURRENT_COMM = 16
BPF_PERF_EVENT_OUTPUT = 25
BPF_GET_STACK = 67
BPF_PROBE_READ_KERNEL_STR = 115
BPF_PROBE_READ_KERNEL = 113
BPF_RINGBUF_OUTPUT = 130
BPF_RINGBUF_RESERVE = 131
BPF_RINGBUF_SUBMIT = 132
BPF_RINGBUF_DISCARD = 133
@HelperHandlerRegistry.register("ktime")
@HelperHandlerRegistry.register(
"ktime",
param_types=[],
return_type=ir.IntType(64),
)
def bpf_ktime_get_ns_emitter(
call,
map_ptr,
@ -54,7 +69,38 @@ def bpf_ktime_get_ns_emitter(
return result, ir.IntType(64)
@HelperHandlerRegistry.register("lookup")
@HelperHandlerRegistry.register(
"get_current_cgroup_id",
param_types=[],
return_type=ir.IntType(64),
)
def bpf_get_current_cgroup_id(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Emit LLVM IR for bpf_get_current_cgroup_id helper function call.
"""
# func is an arg to just have a uniform signature with other emitters
helper_id = ir.Constant(ir.IntType(64), BPFHelperID.BPF_GET_CURRENT_CGROUP_ID.value)
fn_type = ir.FunctionType(ir.IntType(64), [], var_arg=False)
fn_ptr_type = ir.PointerType(fn_type)
fn_ptr = builder.inttoptr(helper_id, fn_ptr_type)
result = builder.call(fn_ptr, [], tail=False)
return result, ir.IntType(64)
@HelperHandlerRegistry.register(
"lookup",
param_types=[ir.PointerType(ir.IntType(64))],
return_type=ir.PointerType(ir.IntType(64)),
)
def bpf_map_lookup_elem_emitter(
call,
map_ptr,
@ -78,9 +124,9 @@ def bpf_map_lookup_elem_emitter(
map_void_ptr = builder.bitcast(map_ptr, ir.PointerType())
# TODO: I have changed the return type to i64*, as we are
# allocating space for that type in allocate_mem. This is
# temporary, and we will honour other widths later. But this
# allows us to have cool binary ops on the returned value.
# allocating space for that type in allocate_mem. This is
# temporary, and we will honour other widths later. But this
# allows us to have cool binary ops on the returned value.
fn_type = ir.FunctionType(
ir.PointerType(ir.IntType(64)), # Return type: void*
[ir.PointerType(), ir.PointerType()], # Args: (void*, void*)
@ -96,6 +142,7 @@ def bpf_map_lookup_elem_emitter(
return result, ir.PointerType()
# NOTE: This has special handling so we won't reflect the signature here.
@HelperHandlerRegistry.register("print")
def bpf_printk_emitter(
call,
@ -144,7 +191,15 @@ def bpf_printk_emitter(
return True
@HelperHandlerRegistry.register("update")
@HelperHandlerRegistry.register(
"update",
param_types=[
ir.PointerType(ir.IntType(64)),
ir.PointerType(ir.IntType(64)),
ir.IntType(64),
],
return_type=ir.PointerType(ir.IntType(64)),
)
def bpf_map_update_elem_emitter(
call,
map_ptr,
@ -199,7 +254,11 @@ def bpf_map_update_elem_emitter(
return result, None
@HelperHandlerRegistry.register("delete")
@HelperHandlerRegistry.register(
"delete",
param_types=[ir.PointerType(ir.IntType(64))],
return_type=ir.PointerType(ir.IntType(64)),
)
def bpf_map_delete_elem_emitter(
call,
map_ptr,
@ -239,7 +298,11 @@ def bpf_map_delete_elem_emitter(
return result, None
@HelperHandlerRegistry.register("comm")
@HelperHandlerRegistry.register(
"comm",
param_types=[ir.PointerType(ir.IntType(8))],
return_type=ir.IntType(64),
)
def bpf_get_current_comm_emitter(
call,
map_ptr,
@ -296,7 +359,11 @@ def bpf_get_current_comm_emitter(
return result, None
@HelperHandlerRegistry.register("pid")
@HelperHandlerRegistry.register(
"pid",
param_types=[],
return_type=ir.IntType(64),
)
def bpf_get_current_pid_tgid_emitter(
call,
map_ptr,
@ -318,12 +385,12 @@ def bpf_get_current_pid_tgid_emitter(
result = builder.call(fn_ptr, [], tail=False)
# Extract the lower 32 bits (PID) using bitwise AND with 0xFFFFFFFF
# TODO: return both PID and TGID if we end up needing TGID somewhere
mask = ir.Constant(ir.IntType(64), 0xFFFFFFFF)
pid = builder.and_(result, mask)
return pid, ir.IntType(64)
@HelperHandlerRegistry.register("output")
def bpf_perf_event_output_handler(
call,
map_ptr,
@ -334,6 +401,10 @@ def bpf_perf_event_output_handler(
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Emit LLVM IR for bpf_perf_event_output helper function call.
"""
if len(call.args) != 1:
raise ValueError(
f"Perf event output expects exactly one argument, got {len(call.args)}"
@ -371,6 +442,98 @@ def bpf_perf_event_output_handler(
return result, None
def bpf_ringbuf_output_emitter(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Emit LLVM IR for bpf_ringbuf_output helper function call.
"""
if len(call.args) != 1:
raise ValueError(
f"Ringbuf output expects exactly one argument, got {len(call.args)}"
)
data_arg = call.args[0]
data_ptr, size_val = get_data_ptr_and_size(data_arg, local_sym_tab, struct_sym_tab)
flags_val = ir.Constant(ir.IntType(64), 0)
map_void_ptr = builder.bitcast(map_ptr, ir.PointerType())
data_void_ptr = builder.bitcast(data_ptr, ir.PointerType())
fn_type = ir.FunctionType(
ir.IntType(64),
[
ir.PointerType(),
ir.PointerType(),
ir.IntType(64),
ir.IntType(64),
],
var_arg=False,
)
fn_ptr_type = ir.PointerType(fn_type)
# helper id
fn_addr = ir.Constant(ir.IntType(64), BPFHelperID.BPF_RINGBUF_OUTPUT.value)
fn_ptr = builder.inttoptr(fn_addr, fn_ptr_type)
result = builder.call(
fn_ptr, [map_void_ptr, data_void_ptr, size_val, flags_val], tail=False
)
return result, None
@HelperHandlerRegistry.register(
"output",
param_types=[ir.PointerType(ir.IntType(8))],
return_type=ir.IntType(64),
)
def handle_output_helper(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Route output helper to the appropriate emitter based on map type.
"""
match map_sym_tab[map_ptr.name].type:
case BPFMapType.PERF_EVENT_ARRAY:
return bpf_perf_event_output_handler(
call,
map_ptr,
module,
builder,
func,
local_sym_tab,
struct_sym_tab,
map_sym_tab,
)
case BPFMapType.RINGBUF:
return bpf_ringbuf_output_emitter(
call,
map_ptr,
module,
builder,
func,
local_sym_tab,
struct_sym_tab,
map_sym_tab,
)
case _:
logger.error("Unsupported map type for output helper.")
raise NotImplementedError("Output helper for this map type is not implemented.")
def emit_probe_read_kernel_str_call(builder, dst_ptr, dst_size, src_ptr):
"""Emit LLVM IR call to bpf_probe_read_kernel_str"""
@ -398,7 +561,14 @@ def emit_probe_read_kernel_str_call(builder, dst_ptr, dst_size, src_ptr):
return result
@HelperHandlerRegistry.register("probe_read_str")
@HelperHandlerRegistry.register(
"probe_read_str",
param_types=[
ir.PointerType(ir.IntType(8)),
ir.PointerType(ir.IntType(8)),
],
return_type=ir.IntType(64),
)
def bpf_probe_read_kernel_str_emitter(
call,
map_ptr,
@ -417,8 +587,8 @@ def bpf_probe_read_kernel_str_emitter(
)
# Get destination buffer (char array -> i8*)
dst_ptr, dst_size = get_char_array_ptr_and_size(
call.args[0], builder, local_sym_tab, struct_sym_tab
dst_ptr, dst_size = get_or_create_ptr_from_arg(
func, module, call.args[0], builder, local_sym_tab, map_sym_tab, struct_sym_tab
)
# Get source pointer (evaluate expression)
@ -433,6 +603,499 @@ def bpf_probe_read_kernel_str_emitter(
return result, ir.IntType(64)
def emit_probe_read_kernel_call(builder, dst_ptr, dst_size, src_ptr):
"""Emit LLVM IR call to bpf_probe_read_kernel"""
fn_type = ir.FunctionType(
ir.IntType(64),
[ir.PointerType(), ir.IntType(32), ir.PointerType()],
var_arg=False,
)
fn_ptr = builder.inttoptr(
ir.Constant(ir.IntType(64), BPFHelperID.BPF_PROBE_READ_KERNEL.value),
ir.PointerType(fn_type),
)
result = builder.call(
fn_ptr,
[
builder.bitcast(dst_ptr, ir.PointerType()),
ir.Constant(ir.IntType(32), dst_size),
builder.bitcast(src_ptr, ir.PointerType()),
],
tail=False,
)
logger.info(f"Emitted bpf_probe_read_kernel (size={dst_size})")
return result
@HelperHandlerRegistry.register(
"probe_read_kernel",
param_types=[
ir.PointerType(ir.IntType(8)),
ir.PointerType(ir.IntType(8)),
],
return_type=ir.IntType(64),
)
def bpf_probe_read_kernel_emitter(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""Emit LLVM IR for bpf_probe_read_kernel helper."""
if len(call.args) != 2:
raise ValueError(
f"probe_read_kernel expects 2 args (dst, src), got {len(call.args)}"
)
# Get destination buffer (char array -> i8*)
dst_ptr, dst_size = get_or_create_ptr_from_arg(
func, module, call.args[0], builder, local_sym_tab, map_sym_tab, struct_sym_tab
)
# Get source pointer (evaluate expression)
src_ptr, src_type = get_ptr_from_arg(
call.args[1], func, module, builder, local_sym_tab, map_sym_tab, struct_sym_tab
)
# Emit the helper call
result = emit_probe_read_kernel_call(builder, dst_ptr, dst_size, src_ptr)
logger.info(f"Emitted bpf_probe_read_kernel (size={dst_size})")
return result, ir.IntType(64)
@HelperHandlerRegistry.register(
"random",
param_types=[],
return_type=ir.IntType(32),
)
def bpf_get_prandom_u32_emitter(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Emit LLVM IR for bpf_get_prandom_u32 helper function call.
"""
helper_id = ir.Constant(ir.IntType(64), BPFHelperID.BPF_GET_PRANDOM_U32.value)
fn_type = ir.FunctionType(ir.IntType(32), [], var_arg=False)
fn_ptr_type = ir.PointerType(fn_type)
fn_ptr = builder.inttoptr(helper_id, fn_ptr_type)
result = builder.call(fn_ptr, [], tail=False)
return result, ir.IntType(32)
@HelperHandlerRegistry.register(
"probe_read",
param_types=[
ir.PointerType(ir.IntType(8)),
ir.IntType(32),
ir.PointerType(ir.IntType(8)),
],
return_type=ir.IntType(64),
)
def bpf_probe_read_emitter(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Emit LLVM IR for bpf_probe_read helper function
"""
if len(call.args) != 3:
logger.warn("Expected 3 args for probe_read helper")
return
dst_ptr = get_or_create_ptr_from_arg(
func,
module,
call.args[0],
builder,
local_sym_tab,
map_sym_tab,
struct_sym_tab,
ir.IntType(8),
)
size_val = get_int_value_from_arg(
call.args[1],
func,
module,
builder,
local_sym_tab,
map_sym_tab,
struct_sym_tab,
)
src_ptr = get_or_create_ptr_from_arg(
func,
module,
call.args[2],
builder,
local_sym_tab,
map_sym_tab,
struct_sym_tab,
ir.IntType(8),
)
fn_type = ir.FunctionType(
ir.IntType(64),
[ir.PointerType(), ir.IntType(32), ir.PointerType()],
var_arg=False,
)
fn_ptr = builder.inttoptr(
ir.Constant(ir.IntType(64), BPFHelperID.BPF_PROBE_READ.value),
ir.PointerType(fn_type),
)
result = builder.call(
fn_ptr,
[
builder.bitcast(dst_ptr, ir.PointerType()),
builder.trunc(size_val, ir.IntType(32)),
builder.bitcast(src_ptr, ir.PointerType()),
],
tail=False,
)
logger.info(f"Emitted bpf_probe_read (size={size_val})")
return result, ir.IntType(64)
@HelperHandlerRegistry.register(
"smp_processor_id",
param_types=[],
return_type=ir.IntType(32),
)
def bpf_get_smp_processor_id_emitter(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Emit LLVM IR for bpf_get_smp_processor_id helper function call.
"""
helper_id = ir.Constant(ir.IntType(64), BPFHelperID.BPF_GET_SMP_PROCESSOR_ID.value)
fn_type = ir.FunctionType(ir.IntType(32), [], var_arg=False)
fn_ptr_type = ir.PointerType(fn_type)
fn_ptr = builder.inttoptr(helper_id, fn_ptr_type)
result = builder.call(fn_ptr, [], tail=False)
logger.info("Emitted bpf_get_smp_processor_id call")
return result, ir.IntType(32)
@HelperHandlerRegistry.register(
"uid",
param_types=[],
return_type=ir.IntType(64),
)
def bpf_get_current_uid_gid_emitter(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Emit LLVM IR for bpf_get_current_uid_gid helper function call.
"""
helper_id = ir.Constant(ir.IntType(64), BPFHelperID.BPF_GET_CURRENT_UID_GID.value)
fn_type = ir.FunctionType(ir.IntType(64), [], var_arg=False)
fn_ptr_type = ir.PointerType(fn_type)
fn_ptr = builder.inttoptr(helper_id, fn_ptr_type)
result = builder.call(fn_ptr, [], tail=False)
# Extract the lower 32 bits (UID) using bitwise AND with 0xFFFFFFFF
# TODO: return both UID and GID if we end up needing GID somewhere
mask = ir.Constant(ir.IntType(64), 0xFFFFFFFF)
pid = builder.and_(result, mask)
return pid, ir.IntType(64)
@HelperHandlerRegistry.register(
"skb_store_bytes",
param_types=[
ir.IntType(32),
ir.PointerType(ir.IntType(8)),
ir.IntType(32),
ir.IntType(64),
],
return_type=ir.IntType(64),
)
def bpf_skb_store_bytes_emitter(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Emit LLVM IR for bpf_skb_store_bytes helper function call.
Expected call signature: skb_store_bytes(skb, offset, from, len, flags)
"""
args_signature = [
ir.PointerType(), # skb pointer
ir.IntType(32), # offset
ir.PointerType(), # from
ir.IntType(32), # len
ir.IntType(64), # flags
]
if len(call.args) not in (3, 4):
raise ValueError(
f"skb_store_bytes expects 3 or 4 args (offset, from, len, flags), got {len(call.args)}"
)
skb_ptr = func.args[0] # First argument to the function is skb
offset_val = get_int_value_from_arg(
call.args[0],
func,
module,
builder,
local_sym_tab,
map_sym_tab,
struct_sym_tab,
)
from_ptr = get_or_create_ptr_from_arg(
func,
module,
call.args[1],
builder,
local_sym_tab,
map_sym_tab,
struct_sym_tab,
args_signature[2],
)
len_val = get_int_value_from_arg(
call.args[2],
func,
module,
builder,
local_sym_tab,
map_sym_tab,
struct_sym_tab,
)
if len(call.args) == 4:
flags_val = get_flags_val(call.args[3], builder, local_sym_tab)
else:
flags_val = 0
if isinstance(flags_val, int):
flags = ir.Constant(ir.IntType(64), flags_val)
else:
flags = flags_val
fn_type = ir.FunctionType(
ir.IntType(64),
args_signature,
var_arg=False,
)
fn_ptr = builder.inttoptr(
ir.Constant(ir.IntType(64), BPFHelperID.BPF_SKB_STORE_BYTES.value),
ir.PointerType(fn_type),
)
result = builder.call(
fn_ptr,
[
builder.bitcast(skb_ptr, ir.PointerType()),
builder.trunc(offset_val, ir.IntType(32)),
builder.bitcast(from_ptr, ir.PointerType()),
builder.trunc(len_val, ir.IntType(32)),
flags,
],
tail=False,
)
logger.info("Emitted bpf_skb_store_bytes call")
return result, ir.IntType(64)
@HelperHandlerRegistry.register(
"reserve",
param_types=[ir.IntType(64)],
return_type=ir.PointerType(ir.IntType(8)),
)
def bpf_ringbuf_reserve_emitter(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Emit LLVM IR for bpf_ringbuf_reserve helper function call.
Expected call signature: ringbuf.reserve(size)
"""
if len(call.args) != 1:
raise ValueError(
f"ringbuf.reserve expects exactly one argument (size), got {len(call.args)}"
)
size_val = get_int_value_from_arg(
call.args[0],
func,
module,
builder,
local_sym_tab,
map_sym_tab,
struct_sym_tab,
)
map_void_ptr = builder.bitcast(map_ptr, ir.PointerType())
fn_type = ir.FunctionType(
ir.PointerType(ir.IntType(8)),
[ir.PointerType(), ir.IntType(64)],
var_arg=False,
)
fn_ptr_type = ir.PointerType(fn_type)
fn_addr = ir.Constant(ir.IntType(64), BPFHelperID.BPF_RINGBUF_RESERVE.value)
fn_ptr = builder.inttoptr(fn_addr, fn_ptr_type)
result = builder.call(fn_ptr, [map_void_ptr, size_val], tail=False)
return result, ir.PointerType(ir.IntType(8))
@HelperHandlerRegistry.register(
"submit",
param_types=[ir.PointerType(ir.IntType(8)), ir.IntType(64)],
return_type=ir.VoidType(),
)
def bpf_ringbuf_submit_emitter(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Emit LLVM IR for bpf_ringbuf_submit helper function call.
Expected call signature: ringbuf.submit(data, flags=0)
"""
if len(call.args) not in (1, 2):
raise ValueError(
f"ringbuf.submit expects 1 or 2 args (data, flags), got {len(call.args)}"
)
data_arg = call.args[0]
flags_arg = call.args[1] if len(call.args) == 2 else None
data_ptr = get_or_create_ptr_from_arg(
func,
module,
data_arg,
builder,
local_sym_tab,
map_sym_tab,
struct_sym_tab,
ir.PointerType(ir.IntType(8)),
)
flags_const = get_flags_val(flags_arg, builder, local_sym_tab)
if isinstance(flags_const, int):
flags_const = ir.Constant(ir.IntType(64), flags_const)
map_void_ptr = builder.bitcast(map_ptr, ir.PointerType())
fn_type = ir.FunctionType(
ir.VoidType(),
[ir.PointerType(), ir.PointerType(), ir.IntType(64)],
var_arg=False,
)
fn_ptr_type = ir.PointerType(fn_type)
fn_addr = ir.Constant(ir.IntType(64), BPFHelperID.BPF_RINGBUF_SUBMIT.value)
fn_ptr = builder.inttoptr(fn_addr, fn_ptr_type)
result = builder.call(fn_ptr, [map_void_ptr, data_ptr, flags_const], tail=False)
return result, None
@HelperHandlerRegistry.register(
"get_stack",
param_types=[ir.PointerType(ir.IntType(8)), ir.IntType(64)],
return_type=ir.IntType(64),
)
def bpf_get_stack_emitter(
call,
map_ptr,
module,
builder,
func,
local_sym_tab=None,
struct_sym_tab=None,
map_sym_tab=None,
):
"""
Emit LLVM IR for bpf_get_stack helper function call.
"""
if len(call.args) not in (1, 2):
raise ValueError(
f"get_stack expects atmost two arguments (buf, flags), got {len(call.args)}"
)
ctx_ptr = func.args[0] # First argument to the function is ctx
buf_arg = call.args[0]
flags_arg = call.args[1] if len(call.args) == 2 else None
buf_ptr, buf_size = get_buffer_ptr_and_size(
buf_arg, builder, local_sym_tab, struct_sym_tab
)
flags_val = get_flags_val(flags_arg, builder, local_sym_tab)
if isinstance(flags_val, int):
flags_val = ir.Constant(ir.IntType(64), flags_val)
buf_void_ptr = builder.bitcast(buf_ptr, ir.PointerType())
fn_type = ir.FunctionType(
ir.IntType(64),
[
ir.PointerType(ir.IntType(8)),
ir.PointerType(),
ir.IntType(64),
ir.IntType(64),
],
var_arg=False,
)
fn_ptr_type = ir.PointerType(fn_type)
fn_addr = ir.Constant(ir.IntType(64), BPFHelperID.BPF_GET_STACK.value)
fn_ptr = builder.inttoptr(fn_addr, fn_ptr_type)
result = builder.call(
fn_ptr,
[ctx_ptr, buf_void_ptr, ir.Constant(ir.IntType(64), buf_size), flags_val],
tail=False,
)
return result, ir.IntType(64)
def handle_helper_call(
call,
module,
@ -487,6 +1150,6 @@ def handle_helper_call(
if not map_sym_tab or map_name not in map_sym_tab:
raise ValueError(f"Map '{map_name}' not found in symbol table")
return invoke_helper(method_name, map_sym_tab[map_name])
return invoke_helper(method_name, map_sym_tab[map_name].sym)
return None

View File

@ -1,17 +1,31 @@
from dataclasses import dataclass
from llvmlite import ir
from typing import Callable
@dataclass
class HelperSignature:
"""Signature of a BPF helper function"""
arg_types: list[ir.Type]
return_type: ir.Type
func: Callable
class HelperHandlerRegistry:
"""Registry for BPF helpers"""
_handlers: dict[str, Callable] = {}
_handlers: dict[str, HelperSignature] = {}
@classmethod
def register(cls, helper_name):
def register(cls, helper_name, param_types=None, return_type=None):
"""Decorator to register a handler function for a helper"""
def decorator(func):
cls._handlers[helper_name] = func
helper_sig = HelperSignature(
arg_types=param_types, return_type=return_type, func=func
)
cls._handlers[helper_name] = helper_sig
return func
return decorator
@ -19,9 +33,29 @@ class HelperHandlerRegistry:
@classmethod
def get_handler(cls, helper_name):
"""Get the handler function for a helper"""
return cls._handlers.get(helper_name)
handler = cls._handlers.get(helper_name)
return handler.func if handler else None
@classmethod
def has_handler(cls, helper_name):
"""Check if a handler function is registered for a helper"""
return helper_name in cls._handlers
@classmethod
def get_signature(cls, helper_name):
"""Get the signature of a helper function"""
return cls._handlers.get(helper_name)
@classmethod
def get_param_type(cls, helper_name, index):
"""Get the type of a parameter of a helper function by the index"""
signature = cls.get_signature(helper_name)
if signature and signature.arg_types and 0 <= index < len(signature.arg_types):
return signature.arg_types[index]
return None
@classmethod
def get_return_type(cls, helper_name):
"""Get the return type of a helper function"""
signature = cls.get_signature(helper_name)
return signature.return_type if signature else None

View File

@ -5,6 +5,7 @@ from llvmlite import ir
from pythonbpf.expr import (
get_operand_value,
eval_expr,
access_struct_field,
)
logger = logging.getLogger(__name__)
@ -14,26 +15,43 @@ class ScratchPoolManager:
"""Manage the temporary helper variables in local_sym_tab"""
def __init__(self):
self._counter = 0
self._counters = {}
@property
def counter(self):
return self._counter
return sum(self._counters.values())
def reset(self):
self._counter = 0
self._counters.clear()
logger.debug("Scratch pool counter reset to 0")
def get_next_temp(self, local_sym_tab):
temp_name = f"__helper_temp_{self._counter}"
self._counter += 1
def _get_type_name(self, ir_type):
if isinstance(ir_type, ir.PointerType):
return "ptr"
elif isinstance(ir_type, ir.IntType):
return f"i{ir_type.width}"
elif isinstance(ir_type, ir.ArrayType):
return f"[{ir_type.count}x{self._get_type_name(ir_type.element)}]"
else:
return str(ir_type).replace(" ", "")
def get_next_temp(self, local_sym_tab, expected_type=None):
# Default to i64 if no expected type provided
type_name = self._get_type_name(expected_type) if expected_type else "i64"
if type_name not in self._counters:
self._counters[type_name] = 0
counter = self._counters[type_name]
temp_name = f"__helper_temp_{type_name}_{counter}"
self._counters[type_name] += 1
if temp_name not in local_sym_tab:
raise ValueError(
f"Scratch pool exhausted or inadequate: {temp_name}. "
f"Current counter: {self._counter}"
f"Type: {type_name} Counter: {counter}"
)
logger.debug(f"Using {temp_name} for type {type_name}")
return local_sym_tab[temp_name].var, temp_name
@ -60,24 +78,73 @@ def get_var_ptr_from_name(var_name, local_sym_tab):
def create_int_constant_ptr(value, builder, local_sym_tab, int_width=64):
"""Create a pointer to an integer constant."""
# Default to 64-bit integer
ptr, temp_name = _temp_pool_manager.get_next_temp(local_sym_tab)
int_type = ir.IntType(int_width)
ptr, temp_name = _temp_pool_manager.get_next_temp(local_sym_tab, int_type)
logger.info(f"Using temp variable '{temp_name}' for int constant {value}")
const_val = ir.Constant(ir.IntType(int_width), value)
const_val = ir.Constant(int_type, value)
builder.store(const_val, ptr)
return ptr
def get_or_create_ptr_from_arg(
func, module, arg, builder, local_sym_tab, map_sym_tab, struct_sym_tab=None
func,
module,
arg,
builder,
local_sym_tab,
map_sym_tab,
struct_sym_tab=None,
expected_type=None,
):
"""Extract or create pointer from the call arguments."""
logger.info(f"Getting pointer from arg: {ast.dump(arg)}")
sz = None
if isinstance(arg, ast.Name):
# Stack space is already allocated
ptr = get_var_ptr_from_name(arg.id, local_sym_tab)
elif isinstance(arg, ast.Constant) and isinstance(arg.value, int):
ptr = create_int_constant_ptr(arg.value, builder, local_sym_tab)
int_width = 64 # Default to i64
if expected_type and isinstance(expected_type, ir.IntType):
int_width = expected_type.width
ptr = create_int_constant_ptr(arg.value, builder, local_sym_tab, int_width)
elif isinstance(arg, ast.Attribute):
# A struct field
struct_name = arg.value.id
field_name = arg.attr
if not local_sym_tab or struct_name not in local_sym_tab:
raise ValueError(f"Struct '{struct_name}' not found")
struct_type = local_sym_tab[struct_name].metadata
if not struct_sym_tab or struct_type not in struct_sym_tab:
raise ValueError(f"Struct type '{struct_type}' not found")
struct_info = struct_sym_tab[struct_type]
if field_name not in struct_info.fields:
raise ValueError(
f"Field '{field_name}' not found in struct '{struct_name}'"
)
field_type = struct_info.field_type(field_name)
struct_ptr = local_sym_tab[struct_name].var
# Special handling for char arrays
if (
isinstance(field_type, ir.ArrayType)
and isinstance(field_type.element, ir.IntType)
and field_type.element.width == 8
):
ptr, sz = get_char_array_ptr_and_size(
arg, builder, local_sym_tab, struct_sym_tab, func
)
if not ptr:
raise ValueError("Failed to get char array pointer from struct field")
else:
ptr = struct_info.gep(builder, struct_ptr, field_name)
else:
# NOTE: For any integer expression reaching this branch, it is probably a struct field or a binop
# Evaluate the expression and store the result in a temp variable
val = get_operand_value(
func, module, arg, builder, local_sym_tab, map_sym_tab, struct_sym_tab
@ -85,13 +152,20 @@ def get_or_create_ptr_from_arg(
if val is None:
raise ValueError("Failed to evaluate expression for helper arg.")
# NOTE: We assume the result is an int64 for now
# if isinstance(arg, ast.Attribute):
# return val
ptr, temp_name = _temp_pool_manager.get_next_temp(local_sym_tab)
ptr, temp_name = _temp_pool_manager.get_next_temp(local_sym_tab, expected_type)
logger.info(f"Using temp variable '{temp_name}' for expression result")
if (
isinstance(val.type, ir.IntType)
and expected_type
and val.type.width > expected_type.width
):
val = builder.trunc(val, expected_type)
builder.store(val, ptr)
# NOTE: For char arrays, also return size
if sz:
return ptr, sz
return ptr
@ -193,7 +267,9 @@ def get_buffer_ptr_and_size(buf_arg, builder, local_sym_tab, struct_sym_tab):
)
def get_char_array_ptr_and_size(buf_arg, builder, local_sym_tab, struct_sym_tab):
def get_char_array_ptr_and_size(
buf_arg, builder, local_sym_tab, struct_sym_tab, func=None
):
"""Get pointer to char array and its size."""
# Struct field: obj.field
@ -204,20 +280,39 @@ def get_char_array_ptr_and_size(buf_arg, builder, local_sym_tab, struct_sym_tab)
if not (local_sym_tab and var_name in local_sym_tab):
raise ValueError(f"Variable '{var_name}' not found")
struct_type = local_sym_tab[var_name].metadata
if not (struct_sym_tab and struct_type in struct_sym_tab):
raise ValueError(f"Struct type '{struct_type}' not found")
struct_ptr, struct_type, struct_metadata = local_sym_tab[var_name]
if not (struct_sym_tab and struct_metadata in struct_sym_tab):
raise ValueError(f"Struct type '{struct_metadata}' not found")
struct_info = struct_sym_tab[struct_type]
struct_info = struct_sym_tab[struct_metadata]
if field_name not in struct_info.fields:
raise ValueError(f"Field '{field_name}' not found")
field_type = struct_info.field_type(field_name)
if not _is_char_array(field_type):
raise ValueError("Expected char array field")
logger.info(
"Field is not a char array, falling back to int or ptr detection"
)
return None, 0
struct_ptr = local_sym_tab[var_name].var
field_ptr = struct_info.gep(builder, struct_ptr, field_name)
# Check if char array
if not (
isinstance(field_type, ir.ArrayType)
and isinstance(field_type.element, ir.IntType)
and field_type.element.width == 8
):
logger.warning("Field is not a char array")
return None, 0
field_ptr, _ = access_struct_field(
builder,
struct_ptr,
struct_type,
struct_metadata,
field_name,
struct_sym_tab,
func,
)
# GEP to first element: [N x i8]* -> i8*
buf_ptr = builder.gep(
@ -274,3 +369,23 @@ def get_ptr_from_arg(
raise ValueError(f"Expected pointer type, got {val_type}")
return val, val_type
def get_int_value_from_arg(
arg, func, module, builder, local_sym_tab, map_sym_tab, struct_sym_tab
):
"""Evaluate argument and return integer value"""
result = eval_expr(
func, module, builder, arg, local_sym_tab, map_sym_tab, struct_sym_tab
)
if not result:
raise ValueError("Failed to evaluate argument")
val, val_type = result
if not isinstance(val_type, ir.IntType):
raise ValueError(f"Expected integer type, got {val_type}")
return val

View File

@ -27,6 +27,41 @@ def probe_read_str(dst, src):
return ctypes.c_int64(0)
def random():
"""get a pseudorandom u32 number"""
return ctypes.c_int32(0)
def probe_read(dst, size, src):
"""Safely read data from kernel memory"""
return ctypes.c_int64(0)
def smp_processor_id():
"""get the current CPU id"""
return ctypes.c_int32(0)
def uid():
"""get current user id"""
return ctypes.c_int32(0)
def skb_store_bytes(offset, from_buf, size, flags=0):
"""store bytes into a socket buffer"""
return ctypes.c_int64(0)
def get_stack(buf, flags=0):
"""get the current stack trace"""
return ctypes.c_int64(0)
def get_current_cgroup_id():
"""Get the current cgroup ID"""
return ctypes.c_int64(0)
XDP_ABORTED = ctypes.c_int64(0)
XDP_DROP = ctypes.c_int64(1)
XDP_PASS = ctypes.c_int64(2)

View File

@ -4,6 +4,7 @@ import logging
from llvmlite import ir
from pythonbpf.expr import eval_expr, get_base_type_and_depth, deref_to_depth
from pythonbpf.expr.vmlinux_registry import VmlinuxHandlerRegistry
from pythonbpf.helper.helper_utils import get_char_array_ptr_and_size
logger = logging.getLogger(__name__)
@ -219,11 +220,12 @@ def _prepare_expr_args(expr, func, module, builder, local_sym_tab, struct_sym_ta
"""Evaluate and prepare an expression to use as an arg for bpf_printk."""
# Special case: struct field char array needs pointer to first element
char_array_ptr = _get_struct_char_array_ptr(
expr, builder, local_sym_tab, struct_sym_tab
)
if char_array_ptr:
return char_array_ptr
if isinstance(expr, ast.Attribute):
char_array_ptr, _ = get_char_array_ptr_and_size(
expr, builder, local_sym_tab, struct_sym_tab, func
)
if char_array_ptr:
return char_array_ptr
# Regular expression evaluation
val, _ = eval_expr(func, module, builder, expr, local_sym_tab, None, struct_sym_tab)
@ -242,52 +244,6 @@ def _prepare_expr_args(expr, func, module, builder, local_sym_tab, struct_sym_ta
return ir.Constant(ir.IntType(64), 0)
def _get_struct_char_array_ptr(expr, builder, local_sym_tab, struct_sym_tab):
"""Get pointer to first element of char array in struct field, or None."""
if not (isinstance(expr, ast.Attribute) and isinstance(expr.value, ast.Name)):
return None
var_name = expr.value.id
field_name = expr.attr
# Check if it's a valid struct field
if not (
local_sym_tab
and var_name in local_sym_tab
and struct_sym_tab
and local_sym_tab[var_name].metadata in struct_sym_tab
):
return None
struct_type = local_sym_tab[var_name].metadata
struct_info = struct_sym_tab[struct_type]
if field_name not in struct_info.fields:
return None
field_type = struct_info.field_type(field_name)
# Check if it's a char array
is_char_array = (
isinstance(field_type, ir.ArrayType)
and isinstance(field_type.element, ir.IntType)
and field_type.element.width == 8
)
if not is_char_array:
return None
# Get field pointer and GEP to first element: [N x i8]* -> i8*
struct_ptr = local_sym_tab[var_name].var
field_ptr = struct_info.gep(builder, struct_ptr, field_name)
return builder.gep(
field_ptr,
[ir.Constant(ir.IntType(32), 0), ir.Constant(ir.IntType(32), 0)],
inbounds=True,
)
def _handle_pointer_arg(val, func, builder):
"""Convert pointer type for bpf_printk."""
target, depth = get_base_type_and_depth(val.type)

15
pythonbpf/local_symbol.py Normal file
View File

@ -0,0 +1,15 @@
import llvmlite.ir as ir
from dataclasses import dataclass
from typing import Any
@dataclass
class LocalSymbol:
var: ir.AllocaInstr
ir_type: ir.Type
metadata: Any = None
def __iter__(self):
yield self.var
yield self.ir_type
yield self.metadata

View File

@ -1,4 +1,5 @@
from .maps import HashMap, PerfEventArray, RingBuf
from .maps import HashMap, PerfEventArray, RingBuffer
from .maps_pass import maps_proc
from .map_types import BPFMapType
__all__ = ["HashMap", "PerfEventArray", "maps_proc", "RingBuf"]
__all__ = ["HashMap", "PerfEventArray", "maps_proc", "RingBuffer", "BPFMapType"]

View File

@ -1,22 +1,31 @@
import logging
from llvmlite import ir
from pythonbpf.debuginfo import DebugInfoGenerator
from .map_types import BPFMapType
logger: logging.Logger = logging.getLogger(__name__)
def create_map_debug_info(module, map_global, map_name, map_params):
def create_map_debug_info(module, map_global, map_name, map_params, structs_sym_tab):
"""Generate debug info metadata for BPF maps HASH and PERF_EVENT_ARRAY"""
generator = DebugInfoGenerator(module)
logger.info(f"Creating debug info for map {map_name} with params {map_params}")
uint_type = generator.get_uint32_type()
ulong_type = generator.get_uint64_type()
array_type = generator.create_array_type(
uint_type, map_params.get("type", BPFMapType.UNSPEC).value
)
type_ptr = generator.create_pointer_type(array_type, 64)
key_ptr = generator.create_pointer_type(
array_type if "key_size" in map_params else ulong_type, 64
array_type
if "key_size" in map_params
else _get_key_val_dbg_type(map_params.get("key"), generator, structs_sym_tab),
64,
)
value_ptr = generator.create_pointer_type(
array_type if "value_size" in map_params else ulong_type, 64
array_type
if "value_size" in map_params
else _get_key_val_dbg_type(map_params.get("value"), generator, structs_sym_tab),
64,
)
elements_arr = []
@ -64,7 +73,13 @@ def create_map_debug_info(module, map_global, map_name, map_params):
return global_var
def create_ringbuf_debug_info(module, map_global, map_name, map_params):
# TODO: This should not be exposed outside of the module.
# Ideally we should expose a single create_map_debug_info function that handles all map types.
# We can probably use a registry pattern to register different map types and their debug info generators.
# map_params["type"] will be used to determine which generator to use.
def create_ringbuf_debug_info(
module, map_global, map_name, map_params, structs_sym_tab
):
"""Generate debug information metadata for BPF RINGBUF map"""
generator = DebugInfoGenerator(module)
@ -91,3 +106,66 @@ def create_ringbuf_debug_info(module, map_global, map_name, map_params):
)
map_global.set_metadata("dbg", global_var)
return global_var
def _get_key_val_dbg_type(name, generator, structs_sym_tab):
"""Get the debug type for key/value based on type object"""
if not name:
logger.warn("No name provided for key/value type, defaulting to uint64")
return generator.get_uint64_type()
type_obj = structs_sym_tab.get(name)
if type_obj:
logger.info(f"Found struct named {name}, generating debug type")
return _get_struct_debug_type(type_obj, generator, structs_sym_tab)
# Fallback to basic types
logger.info(f"No struct named {name}, falling back to basic type")
# NOTE: Only handling int and long for now
if name in ["c_int32", "c_uint32"]:
return generator.get_uint32_type()
# Default fallback for now
return generator.get_uint64_type()
def _get_struct_debug_type(struct_obj, generator, structs_sym_tab):
"""Recursively create debug type for struct"""
elements_arr = []
for fld in struct_obj.fields.keys():
fld_type = struct_obj.field_type(fld)
if isinstance(fld_type, ir.IntType):
if fld_type.width == 32:
fld_dbg_type = generator.get_uint32_type()
else:
# NOTE: Assuming 64-bit for all other int types
fld_dbg_type = generator.get_uint64_type()
elif isinstance(fld_type, ir.ArrayType):
# NOTE: Array types have u8 elements only for now
# Debug info generation should fail for other types
elem_type = fld_type.element
if isinstance(elem_type, ir.IntType) and elem_type.width == 8:
char_type = generator.get_uint8_type()
fld_dbg_type = generator.create_array_type(char_type, fld_type.count)
else:
logger.warning(
f"Array element type {str(elem_type)} not supported for debug info, skipping"
)
continue
else:
# NOTE: Only handling int and char arrays for now
logger.warning(
f"Field type {str(fld_type)} not supported for debug info, skipping"
)
continue
member = generator.create_struct_member(
fld, fld_dbg_type, struct_obj.field_size(fld)
)
elements_arr.append(member)
struct_type = generator.create_struct_type(
elements_arr, struct_obj.size * 8, is_distinct=True
)
return struct_type

View File

@ -36,11 +36,14 @@ class PerfEventArray:
pass # Placeholder for output method
class RingBuf:
class RingBuffer:
def __init__(self, max_entries):
self.max_entries = max_entries
def reserve(self, size: int, flags=0):
def output(self, data, flags=0):
pass
def reserve(self, size: int):
if size > self.max_entries:
raise ValueError("size cannot be greater than set maximum entries")
return 0
@ -48,4 +51,7 @@ class RingBuf:
def submit(self, data, flags=0):
pass
def discard(self, data, flags=0):
pass
# add discard, output and also give names to flags and stuff

View File

@ -3,7 +3,7 @@ import logging
from logging import Logger
from llvmlite import ir
from .maps_utils import MapProcessorRegistry
from .maps_utils import MapProcessorRegistry, MapSymbol
from .map_types import BPFMapType
from .map_debug_info import create_map_debug_info, create_ringbuf_debug_info
from pythonbpf.expr.vmlinux_registry import VmlinuxHandlerRegistry
@ -12,13 +12,15 @@ from pythonbpf.expr.vmlinux_registry import VmlinuxHandlerRegistry
logger: Logger = logging.getLogger(__name__)
def maps_proc(tree, module, chunks):
def maps_proc(tree, module, chunks, structs_sym_tab):
"""Process all functions decorated with @map to find BPF maps"""
map_sym_tab = {}
for func_node in chunks:
if is_map(func_node):
logger.info(f"Found BPF map: {func_node.name}")
map_sym_tab[func_node.name] = process_bpf_map(func_node, module)
map_sym_tab[func_node.name] = process_bpf_map(
func_node, module, structs_sym_tab
)
return map_sym_tab
@ -46,7 +48,7 @@ def create_bpf_map(module, map_name, map_params):
map_global.align = 8
logger.info(f"Created BPF map: {map_name} with params {map_params}")
return map_global
return MapSymbol(type=map_params["type"], sym=map_global, params=map_params)
def _parse_map_params(rval, expected_args=None):
@ -60,7 +62,8 @@ def _parse_map_params(rval, expected_args=None):
if i < len(rval.args):
arg = rval.args[i]
if isinstance(arg, ast.Name):
params[arg_name] = arg.id
result = _get_vmlinux_enum(handler, arg.id)
params[arg_name] = result if result is not None else arg.id
elif isinstance(arg, ast.Constant):
params[arg_name] = arg.value
@ -68,33 +71,48 @@ def _parse_map_params(rval, expected_args=None):
for keyword in rval.keywords:
if isinstance(keyword.value, ast.Name):
name = keyword.value.id
if handler and handler.is_vmlinux_enum(name):
result = handler.get_vmlinux_enum_value(name)
params[keyword.arg] = result if result is not None else name
else:
params[keyword.arg] = name
result = _get_vmlinux_enum(handler, name)
params[keyword.arg] = result if result is not None else name
elif isinstance(keyword.value, ast.Constant):
params[keyword.arg] = keyword.value.value
return params
@MapProcessorRegistry.register("RingBuf")
def process_ringbuf_map(map_name, rval, module):
def _get_vmlinux_enum(handler, name):
if handler and handler.is_vmlinux_enum(name):
return handler.get_vmlinux_enum_value(name)
@MapProcessorRegistry.register("RingBuffer")
def process_ringbuf_map(map_name, rval, module, structs_sym_tab):
"""Process a BPF_RINGBUF map declaration"""
logger.info(f"Processing Ringbuf: {map_name}")
map_params = _parse_map_params(rval, expected_args=["max_entries"])
map_params["type"] = BPFMapType.RINGBUF
# NOTE: constraints borrowed from https://docs.ebpf.io/linux/map-type/BPF_MAP_TYPE_RINGBUF/
max_entries = map_params.get("max_entries")
if (
not isinstance(max_entries, int)
or max_entries < 4096
or (max_entries & (max_entries - 1)) != 0
):
raise ValueError(
"Ringbuf max_entries must be a power of two greater than or equal to the page size (4096)"
)
logger.info(f"Ringbuf map parameters: {map_params}")
map_global = create_bpf_map(module, map_name, map_params)
create_ringbuf_debug_info(module, map_global, map_name, map_params)
create_ringbuf_debug_info(
module, map_global.sym, map_name, map_params, structs_sym_tab
)
return map_global
@MapProcessorRegistry.register("HashMap")
def process_hash_map(map_name, rval, module):
def process_hash_map(map_name, rval, module, structs_sym_tab):
"""Process a BPF_HASH map declaration"""
logger.info(f"Processing HashMap: {map_name}")
map_params = _parse_map_params(rval, expected_args=["key", "value", "max_entries"])
@ -103,12 +121,12 @@ def process_hash_map(map_name, rval, module):
logger.info(f"Map parameters: {map_params}")
map_global = create_bpf_map(module, map_name, map_params)
# Generate debug info for BTF
create_map_debug_info(module, map_global, map_name, map_params)
create_map_debug_info(module, map_global.sym, map_name, map_params, structs_sym_tab)
return map_global
@MapProcessorRegistry.register("PerfEventArray")
def process_perf_event_map(map_name, rval, module):
def process_perf_event_map(map_name, rval, module, structs_sym_tab):
"""Process a BPF_PERF_EVENT_ARRAY map declaration"""
logger.info(f"Processing PerfEventArray: {map_name}")
map_params = _parse_map_params(rval, expected_args=["key_size", "value_size"])
@ -117,11 +135,11 @@ def process_perf_event_map(map_name, rval, module):
logger.info(f"Map parameters: {map_params}")
map_global = create_bpf_map(module, map_name, map_params)
# Generate debug info for BTF
create_map_debug_info(module, map_global, map_name, map_params)
create_map_debug_info(module, map_global.sym, map_name, map_params, structs_sym_tab)
return map_global
def process_bpf_map(func_node, module):
def process_bpf_map(func_node, module, structs_sym_tab):
"""Process a BPF map (a function decorated with @map)"""
map_name = func_node.name
logger.info(f"Processing BPF map: {map_name}")
@ -140,7 +158,7 @@ def process_bpf_map(func_node, module):
if isinstance(rval, ast.Call) and isinstance(rval.func, ast.Name):
handler = MapProcessorRegistry.get_processor(rval.func.id)
if handler:
return handler(map_name, rval, module)
return handler(map_name, rval, module, structs_sym_tab)
else:
logger.warning(f"Unknown map type {rval.func.id}, defaulting to HashMap")
return process_hash_map(map_name, rval, module)

View File

@ -1,5 +1,17 @@
from collections.abc import Callable
from dataclasses import dataclass
from llvmlite import ir
from typing import Any
from .map_types import BPFMapType
@dataclass
class MapSymbol:
"""Class representing a symbol on the map"""
type: BPFMapType
sym: ir.GlobalVariable
params: dict[str, Any] | None = None
class MapProcessorRegistry:

View File

@ -13,6 +13,15 @@ mapping = {
"c_float": ir.FloatType(),
"c_double": ir.DoubleType(),
"c_void_p": ir.IntType(64),
"c_long": ir.IntType(64),
"c_ulong": ir.IntType(64),
"c_longlong": ir.IntType(64),
"c_uint": ir.IntType(32),
"c_int": ir.IntType(32),
"c_ushort": ir.IntType(16),
"c_short": ir.IntType(16),
"c_ubyte": ir.IntType(8),
"c_byte": ir.IntType(8),
# Not so sure about this one
"str": ir.PointerType(ir.IntType(8)),
}

View File

@ -1,12 +1,11 @@
from enum import Enum, auto
from typing import Any, Dict, List, Optional, TypedDict
from typing import Any, Dict, List, Optional
from dataclasses import dataclass
import llvmlite.ir as ir
from pythonbpf.vmlinux_parser.dependency_node import Field
@dataclass
class AssignmentType(Enum):
CONSTANT = auto()
STRUCT = auto()
@ -16,7 +15,7 @@ class AssignmentType(Enum):
@dataclass
class FunctionSignature(TypedDict):
class FunctionSignature:
return_type: str
param_types: List[str]
varargs: bool
@ -24,7 +23,7 @@ class FunctionSignature(TypedDict):
# Thew name of the assignment will be in the dict that uses this class
@dataclass
class AssignmentInfo(TypedDict):
class AssignmentInfo:
value_type: AssignmentType
python_type: type
value: Optional[Any]
@ -34,3 +33,4 @@ class AssignmentInfo(TypedDict):
# Value is a tuple that contains the global variable representing that field
# along with all the information about that field as a Field type.
members: Optional[Dict[str, tuple[ir.GlobalVariable, Field]]] # For structs.
debug_info: Any

View File

@ -16,6 +16,33 @@ def get_module_symbols(module_name: str):
return [name for name in dir(imported_module)], imported_module
def unwrap_pointer_type(type_obj: Any) -> Any:
"""
Recursively unwrap all pointer layers to get the base type.
This handles multiply nested pointers like LP_LP_struct_attribute_group
and returns the base type (struct_attribute_group).
Stops unwrapping when reaching a non-pointer type (one without _type_ attribute).
Args:
type_obj: The type object to unwrap
Returns:
The base type after unwrapping all pointer layers
"""
current_type = type_obj
# Keep unwrapping while it's a pointer/array type (has _type_)
# But stop if _type_ is just a string or basic type marker
while hasattr(current_type, "_type_"):
next_type = current_type._type_
# Stop if _type_ is a string (like 'c' for c_char)
if isinstance(next_type, str):
break
current_type = next_type
return current_type
def process_vmlinux_class(
node,
llvm_module,
@ -158,13 +185,90 @@ def process_vmlinux_post_ast(
if hasattr(elem_type, "_length_") and is_complex_type:
type_length = elem_type._length_
if containing_type.__module__ == "vmlinux":
new_dep_node.add_dependent(
elem_type._type_.__name__
if hasattr(elem_type._type_, "__name__")
else str(elem_type._type_)
# Unwrap all pointer layers to get the base type for dependency tracking
base_type = unwrap_pointer_type(elem_type)
base_type_module = getattr(base_type, "__module__", None)
if base_type_module == "vmlinux":
base_type_name = (
base_type.__name__
if hasattr(base_type, "__name__")
else str(base_type)
)
# ONLY add vmlinux types as dependencies
new_dep_node.add_dependent(base_type_name)
logger.debug(
f"{containing_type} containing type of parent {elem_name} with {elem_type} and ctype {ctype_complex_type} and length {type_length}"
)
new_dep_node.set_field_containing_type(
elem_name, containing_type
)
new_dep_node.set_field_type_size(elem_name, type_length)
new_dep_node.set_field_ctype_complex_type(
elem_name, ctype_complex_type
)
new_dep_node.set_field_type(elem_name, elem_type)
# Check the containing_type module to decide whether to recurse
containing_type_module = getattr(
containing_type, "__module__", None
)
if containing_type_module == "vmlinux":
# Also unwrap containing_type to get base type name
base_containing_type = unwrap_pointer_type(
containing_type
)
containing_type_name = (
base_containing_type.__name__
if hasattr(base_containing_type, "__name__")
else str(base_containing_type)
)
# Check for self-reference or already processed
if containing_type_name == current_symbol_name:
# Self-referential pointer
logger.debug(
f"Self-referential pointer in {current_symbol_name}.{elem_name}"
)
new_dep_node.set_field_ready(elem_name, True)
elif handler.has_node(containing_type_name):
# Already processed
logger.debug(
f"Reusing already processed {containing_type_name}"
)
new_dep_node.set_field_ready(elem_name, True)
else:
# Process recursively - use base containing type, not the pointer wrapper
new_dep_node.add_dependent(containing_type_name)
process_vmlinux_post_ast(
base_containing_type,
llvm_handler,
handler,
processing_stack,
)
new_dep_node.set_field_ready(elem_name, True)
elif (
containing_type_module == ctypes.__name__
or containing_type_module is None
):
logger.debug(
f"Processing ctype internal{containing_type}"
)
new_dep_node.set_field_ready(elem_name, True)
else:
raise TypeError(
f"Module not supported in recursive resolution: {containing_type_module}"
)
elif (
base_type_module == ctypes.__name__
or base_type_module is None
):
# Handle ctypes or types with no module (like some internal ctypes types)
# DO NOT add ctypes as dependencies - just set field metadata and mark ready
logger.debug(
f"Base type {base_type} is ctypes - NOT adding as dependency, just processing field"
)
elif containing_type.__module__ == ctypes.__name__:
if isinstance(elem_type, type):
if issubclass(elem_type, ctypes.Array):
ctype_complex_type = ctypes.Array
@ -176,57 +280,20 @@ def process_vmlinux_post_ast(
)
else:
raise TypeError("Unsupported ctypes subclass")
else:
raise ImportError(
f"Unsupported module of {containing_type}"
)
logger.debug(
f"{containing_type} containing type of parent {elem_name} with {elem_type} and ctype {ctype_complex_type} and length {type_length}"
)
new_dep_node.set_field_containing_type(
elem_name, containing_type
)
new_dep_node.set_field_type_size(elem_name, type_length)
new_dep_node.set_field_ctype_complex_type(
elem_name, ctype_complex_type
)
new_dep_node.set_field_type(elem_name, elem_type)
if containing_type.__module__ == "vmlinux":
containing_type_name = (
containing_type.__name__
if hasattr(containing_type, "__name__")
else str(containing_type)
)
# Check for self-reference or already processed
if containing_type_name == current_symbol_name:
# Self-referential pointer
logger.debug(
f"Self-referential pointer in {current_symbol_name}.{elem_name}"
)
new_dep_node.set_field_ready(elem_name, True)
elif handler.has_node(containing_type_name):
# Already processed
logger.debug(
f"Reusing already processed {containing_type_name}"
)
new_dep_node.set_field_ready(elem_name, True)
else:
# Process recursively - THIS WAS MISSING
new_dep_node.add_dependent(containing_type_name)
process_vmlinux_post_ast(
containing_type,
llvm_handler,
handler,
processing_stack,
)
new_dep_node.set_field_ready(elem_name, True)
elif containing_type.__module__ == ctypes.__name__:
logger.debug(f"Processing ctype internal{containing_type}")
# Set field metadata but DO NOT add dependency or recurse
new_dep_node.set_field_containing_type(
elem_name, containing_type
)
new_dep_node.set_field_type_size(elem_name, type_length)
new_dep_node.set_field_ctype_complex_type(
elem_name, ctype_complex_type
)
new_dep_node.set_field_type(elem_name, elem_type)
new_dep_node.set_field_ready(elem_name, True)
else:
raise TypeError(
"Module not supported in recursive resolution"
raise ImportError(
f"Unsupported module of {base_type}: {base_type_module}"
)
else:
new_dep_node.add_dependent(
@ -245,9 +312,12 @@ def process_vmlinux_post_ast(
raise ValueError(
f"{elem_name} with type {elem_type} from module {module_name} not supported in recursive resolver"
)
elif module_name == ctypes.__name__ or module_name is None:
# Handle ctypes types - these don't need processing, just return
logger.debug(f"Skipping ctypes type {current_symbol_name}")
return True
else:
raise ImportError("UNSUPPORTED Module")
raise ImportError(f"UNSUPPORTED Module {module_name}")
logger.info(
f"{current_symbol_name} processed and handler readiness {handler.is_ready}"

View File

@ -11,7 +11,9 @@ from .class_handler import process_vmlinux_class
logger = logging.getLogger(__name__)
def detect_import_statement(tree: ast.AST) -> list[tuple[str, ast.ImportFrom]]:
def detect_import_statement(
tree: ast.AST,
) -> list[tuple[str, ast.ImportFrom, str, str]]:
"""
Parse AST and detect import statements from vmlinux.
@ -25,7 +27,7 @@ def detect_import_statement(tree: ast.AST) -> list[tuple[str, ast.ImportFrom]]:
List of tuples containing (module_name, imported_item) for each vmlinux import
Raises:
SyntaxError: If multiple imports from vmlinux are attempted or import * is used
SyntaxError: If import * is used
"""
vmlinux_imports = []
@ -40,28 +42,19 @@ def detect_import_statement(tree: ast.AST) -> list[tuple[str, ast.ImportFrom]]:
"Please import specific types explicitly."
)
# Check for multiple imports: from vmlinux import A, B, C
if len(node.names) > 1:
imported_names = [alias.name for alias in node.names]
raise SyntaxError(
f"Multiple imports from vmlinux are not supported. "
f"Found: {', '.join(imported_names)}. "
f"Please use separate import statements for each type."
)
# Check if no specific import is specified (should not happen with valid Python)
if len(node.names) == 0:
raise SyntaxError(
"Import from vmlinux must specify at least one type."
)
# Valid single import
# Support multiple imports: from vmlinux import A, B, C
for alias in node.names:
import_name = alias.name
# Use alias if provided, otherwise use the original name (commented)
# as_name = alias.asname if alias.asname else alias.name
vmlinux_imports.append(("vmlinux", node))
logger.info(f"Found vmlinux import: {import_name}")
# Use alias if provided, otherwise use the original name
as_name = alias.asname if alias.asname else alias.name
vmlinux_imports.append(("vmlinux", node, import_name, as_name))
logger.info(f"Found vmlinux import: {import_name} as {as_name}")
# Handle "import vmlinux" statements (not typical but should be rejected)
elif isinstance(node, ast.Import):
@ -86,57 +79,54 @@ def vmlinux_proc(tree: ast.AST, module):
if not import_statements:
logger.info("No vmlinux imports found")
return
return None
# Import vmlinux module directly
try:
vmlinux_mod = importlib.import_module("vmlinux")
except ImportError:
logger.warning("Could not import vmlinux module")
return
return None
source_file = inspect.getsourcefile(vmlinux_mod)
if source_file is None:
logger.warning("Cannot find source for vmlinux module")
return
return None
with open(source_file, "r") as f:
mod_ast = ast.parse(f.read(), filename=source_file)
for import_mod, import_node in import_statements:
for alias in import_node.names:
imported_name = alias.name
found = False
for mod_node in mod_ast.body:
if (
isinstance(mod_node, ast.ClassDef)
and mod_node.name == imported_name
):
process_vmlinux_class(mod_node, module, handler)
found = True
break
if isinstance(mod_node, ast.Assign):
for target in mod_node.targets:
if isinstance(target, ast.Name) and target.id == imported_name:
process_vmlinux_assign(mod_node, module, assignments)
found = True
break
if found:
break
if not found:
logger.info(
f"{imported_name} not found as ClassDef or Assign in vmlinux"
)
for import_mod, import_node, imported_name, as_name in import_statements:
found = False
for mod_node in mod_ast.body:
if isinstance(mod_node, ast.ClassDef) and mod_node.name == imported_name:
process_vmlinux_class(mod_node, module, handler)
found = True
break
if isinstance(mod_node, ast.Assign):
for target in mod_node.targets:
if isinstance(target, ast.Name) and target.id == imported_name:
process_vmlinux_assign(mod_node, module, assignments, as_name)
found = True
break
if found:
break
if not found:
logger.info(f"{imported_name} not found as ClassDef or Assign in vmlinux")
IRGenerator(module, handler, assignments)
return assignments
def process_vmlinux_assign(node, module, assignments: dict[str, AssignmentInfo]):
def process_vmlinux_assign(
node, module, assignments: dict[str, AssignmentInfo], target_name=None
):
"""Process assignments from vmlinux module."""
# Only handle single-target assignments
if len(node.targets) == 1 and isinstance(node.targets[0], ast.Name):
target_name = node.targets[0].id
# Use provided target_name (for aliased imports) or fall back to original name
if target_name is None:
target_name = node.targets[0].id
# Handle constant value assignments
if isinstance(node.value, ast.Constant):
@ -148,6 +138,7 @@ def process_vmlinux_assign(node, module, assignments: dict[str, AssignmentInfo])
pointer_level=None,
signature=None,
members=None,
debug_info=None,
)
logger.info(
f"Added assignment: {target_name} = {node.value.value!r} of type {type(node.value.value)}"

View File

@ -21,7 +21,7 @@ def debug_info_generation(
generated_debug_info: List of tuples (struct, debug_info) to track generated debug info
Returns:
The generated global variable debug info
The generated global variable debug info, or None for unsupported types
"""
# Set up debug info generator
generator = DebugInfoGenerator(llvm_module)
@ -31,23 +31,42 @@ def debug_info_generation(
if existing_struct.name == struct.name:
return debug_info
# Check if this is a union (not supported yet)
if not struct.name.startswith("struct_"):
logger.warning(f"Skipping debug info generation for union: {struct.name}")
# Create a minimal forward declaration for unions
union_type = generator.create_struct_type(
[], struct.__sizeof__() * 8, is_distinct=True
)
return union_type
# Process all fields and create members for the struct
members = []
for field_name, field in struct.fields.items():
# Get appropriate debug type for this field
field_type = _get_field_debug_type(
field_name, field, generator, struct, generated_debug_info
)
# Create struct member with proper offset
member = generator.create_struct_member_vmlinux(
field_name, field_type, field.offset * 8
)
members.append(member)
if struct.name.startswith("struct_"):
struct_name = struct.name.removeprefix("struct_")
else:
raise ValueError("Unions are not supported in the current version")
sorted_fields = sorted(struct.fields.items(), key=lambda item: item[1].offset)
for field_name, field in sorted_fields:
try:
# Get appropriate debug type for this field
field_type = _get_field_debug_type(
field_name, field, generator, struct, generated_debug_info
)
# Ensure field_type is a tuple
if not isinstance(field_type, tuple) or len(field_type) != 2:
logger.error(f"Invalid field_type for {field_name}: {field_type}")
continue
# Create struct member with proper offset
member = generator.create_struct_member_vmlinux(
field_name, field_type, field.offset * 8
)
members.append(member)
except Exception as e:
logger.error(f"Failed to process field {field_name} in {struct.name}: {e}")
continue
struct_name = struct.name.removeprefix("struct_")
# Create struct type with all members
struct_type = generator.create_struct_type_with_name(
struct_name, members, struct.__sizeof__() * 8, is_distinct=True
@ -74,11 +93,19 @@ def _get_field_debug_type(
generated_debug_info: List of already generated debug info
Returns:
The debug info type for this field
A tuple of (debug_type, size_in_bits)
"""
# Handle complex types (arrays, pointers)
# Handle complex types (arrays, pointers, function pointers)
if field.ctype_complex_type is not None:
if issubclass(field.ctype_complex_type, ctypes.Array):
# Handle function pointer types (CFUNCTYPE)
if callable(field.ctype_complex_type):
# Function pointers are represented as void pointers
logger.warning(
f"Field {field_name} is a function pointer, using void pointer"
)
void_ptr = generator.create_pointer_type(None, 64)
return void_ptr, 64
elif issubclass(field.ctype_complex_type, ctypes.Array):
# Handle array types
element_type, base_type_size = _get_basic_debug_type(
field.containing_type, generator
@ -100,11 +127,13 @@ def _get_field_debug_type(
for existing_struct, debug_info in generated_debug_info:
if existing_struct.name == struct_name:
# Use existing debug info
return debug_info, existing_struct.__sizeof__()
return debug_info, existing_struct.__sizeof__() * 8
# If not found, create a forward declaration
# This will be completed when the actual struct is processed
logger.warning("Forward declaration in struct created")
logger.info(
f"Forward declaration created for {struct_name} in {parent_struct.name}"
)
forward_type = generator.create_struct_type([], 0, is_distinct=True)
return forward_type, 0

View File

@ -11,6 +11,10 @@ logger = logging.getLogger(__name__)
class IRGenerator:
# This field keeps track of the non_struct names to avoid duplicate name errors.
type_number = 0
unprocessed_store: list[str] = []
# get the assignments dict and add this stuff to it.
def __init__(self, llvm_module, handler: DependencyHandler, assignments):
self.llvm_module = llvm_module
@ -73,9 +77,8 @@ class IRGenerator:
)
# Generate IR first to populate field names
self.generated_debug_info.append(
(struct, self.gen_ir(struct, self.generated_debug_info))
)
struct_debug_info = self.gen_ir(struct, self.generated_debug_info)
self.generated_debug_info.append((struct, struct_debug_info))
# Fill the assignments dictionary with struct information
if struct.name not in self.assignments:
@ -105,6 +108,7 @@ class IRGenerator:
pointer_level=None,
signature=None,
members=members_dict,
debug_info=struct_debug_info,
)
logger.info(f"Added struct assignment info for {struct.name}")
@ -129,7 +133,19 @@ class IRGenerator:
for field_name, field in struct.fields.items():
# does not take arrays and similar types into consideration yet.
if field.ctype_complex_type is not None and issubclass(
if callable(field.ctype_complex_type):
# Function pointer case - generate a simple field accessor
field_co_re_name, returned = self._struct_name_generator(
struct, field, field_index
)
field_index += 1
globvar = ir.GlobalVariable(
self.llvm_module, ir.IntType(64), name=field_co_re_name
)
globvar.linkage = "external"
globvar.set_metadata("llvm.preserve.access.index", debug_info)
self.generated_field_names[struct.name][field_name] = globvar
elif field.ctype_complex_type is not None and issubclass(
field.ctype_complex_type, ctypes.Array
):
array_size = field.type_size
@ -137,7 +153,7 @@ class IRGenerator:
if containing_type.__module__ == ctypes.__name__:
containing_type_size = ctypes.sizeof(containing_type)
if array_size == 0:
field_co_re_name = self._struct_name_generator(
field_co_re_name, returned = self._struct_name_generator(
struct, field, field_index, True, 0, containing_type_size
)
globvar = ir.GlobalVariable(
@ -149,7 +165,7 @@ class IRGenerator:
field_index += 1
continue
for i in range(0, array_size):
field_co_re_name = self._struct_name_generator(
field_co_re_name, returned = self._struct_name_generator(
struct, field, field_index, True, i, containing_type_size
)
globvar = ir.GlobalVariable(
@ -163,12 +179,28 @@ class IRGenerator:
array_size = field.type_size
containing_type = field.containing_type
if containing_type.__module__ == "vmlinux":
containing_type_size = self.handler[
containing_type.__name__
].current_offset
for i in range(0, array_size):
field_co_re_name = self._struct_name_generator(
struct, field, field_index, True, i, containing_type_size
# Unwrap all pointer layers to get the base struct type
base_containing_type = containing_type
while hasattr(base_containing_type, "_type_"):
next_type = base_containing_type._type_
# Stop if _type_ is a string (like 'c' for c_char)
# TODO: stacked pointers not handl;ing ctypes check here as well
if isinstance(next_type, str):
break
base_containing_type = next_type
# Get the base struct name
base_struct_name = (
base_containing_type.__name__
if hasattr(base_containing_type, "__name__")
else str(base_containing_type)
)
# Look up the size using the base struct name
containing_type_size = self.handler[base_struct_name].current_offset
if array_size == 0:
field_co_re_name, returned = self._struct_name_generator(
struct, field, field_index, True, 0, containing_type_size
)
globvar = ir.GlobalVariable(
self.llvm_module, ir.IntType(64), name=field_co_re_name
@ -176,9 +208,30 @@ class IRGenerator:
globvar.linkage = "external"
globvar.set_metadata("llvm.preserve.access.index", debug_info)
self.generated_field_names[struct.name][field_name] = globvar
field_index += 1
field_index += 1
else:
for i in range(0, array_size):
field_co_re_name, returned = self._struct_name_generator(
struct,
field,
field_index,
True,
i,
containing_type_size,
)
globvar = ir.GlobalVariable(
self.llvm_module, ir.IntType(64), name=field_co_re_name
)
globvar.linkage = "external"
globvar.set_metadata(
"llvm.preserve.access.index", debug_info
)
self.generated_field_names[struct.name][field_name] = (
globvar
)
field_index += 1
else:
field_co_re_name = self._struct_name_generator(
field_co_re_name, returned = self._struct_name_generator(
struct, field, field_index
)
field_index += 1
@ -198,7 +251,7 @@ class IRGenerator:
is_indexed: bool = False,
index: int = 0,
containing_type_size: int = 0,
) -> str:
) -> tuple[str, bool]:
# TODO: Does not support Unions as well as recursive pointer and array type naming
if is_indexed:
name = (
@ -208,7 +261,7 @@ class IRGenerator:
+ "$"
+ f"0:{field_index}:{index}"
)
return name
return name, True
elif struct.name.startswith("struct_"):
name = (
"llvm."
@ -217,9 +270,18 @@ class IRGenerator:
+ "$"
+ f"0:{field_index}"
)
return name
return name, True
else:
print(self.handler[struct.name])
raise TypeError(
"Name generation cannot occur due to type name not starting with struct"
logger.warning(
"Blindly handling non-struct type to avoid type errors in vmlinux IR generation. Possibly a union."
)
self.type_number += 1
unprocessed_type = "unprocessed_type_" + str(self.handler[struct.name].name)
if self.unprocessed_store.__contains__(unprocessed_type):
return unprocessed_type + "_" + str(self.type_number), False
else:
self.unprocessed_store.append(unprocessed_type)
return unprocessed_type, False
# raise TypeError(
# "Name generation cannot occur due to type name not starting with struct"
# )

View File

@ -1,6 +1,9 @@
import logging
from typing import Any
import ctypes
from llvmlite import ir
from pythonbpf.local_symbol import LocalSymbol
from pythonbpf.vmlinux_parser.assignment_info import AssignmentType
logger = logging.getLogger(__name__)
@ -36,55 +39,368 @@ class VmlinuxHandler:
"""Check if name is a vmlinux enum constant"""
return (
name in self.vmlinux_symtab
and self.vmlinux_symtab[name]["value_type"] == AssignmentType.CONSTANT
and self.vmlinux_symtab[name].value_type == AssignmentType.CONSTANT
)
def get_struct_debug_info(self, name: str) -> Any:
if (
name in self.vmlinux_symtab
and self.vmlinux_symtab[name].value_type == AssignmentType.STRUCT
):
return self.vmlinux_symtab[name].debug_info
else:
raise ValueError(f"{name} is not a vmlinux struct type")
def get_vmlinux_struct_type(self, name):
"""Check if name is a vmlinux struct type"""
if (
name in self.vmlinux_symtab
and self.vmlinux_symtab[name].value_type == AssignmentType.STRUCT
):
return self.vmlinux_symtab[name].python_type
else:
raise ValueError(f"{name} is not a vmlinux struct type")
def is_vmlinux_struct(self, name):
"""Check if name is a vmlinux struct"""
return (
name in self.vmlinux_symtab
and self.vmlinux_symtab[name]["value_type"] == AssignmentType.STRUCT
and self.vmlinux_symtab[name].value_type == AssignmentType.STRUCT
)
def handle_vmlinux_enum(self, name):
"""Handle vmlinux enum constants by returning LLVM IR constants"""
if self.is_vmlinux_enum(name):
value = self.vmlinux_symtab[name]["value"]
value = self.vmlinux_symtab[name].value
logger.info(f"Resolving vmlinux enum {name} = {value}")
return ir.Constant(ir.IntType(64), value), ir.IntType(64)
return None
def get_vmlinux_enum_value(self, name):
"""Handle vmlinux enum constants by returning LLVM IR constants"""
"""Handle vmlinux.enum constants by returning LLVM IR constants"""
if self.is_vmlinux_enum(name):
value = self.vmlinux_symtab[name]["value"]
value = self.vmlinux_symtab[name].value
logger.info(f"The value of vmlinux enum {name} = {value}")
return value
return None
def handle_vmlinux_struct(self, struct_name, module, builder):
"""Handle vmlinux struct initializations"""
if self.is_vmlinux_struct(struct_name):
# TODO: Implement core-specific struct handling
# This will be more complex and depends on the BTF information
logger.info(f"Handling vmlinux struct {struct_name}")
# Return struct type and allocated pointer
# This is a stub, actual implementation will be more complex
return None
return None
def handle_vmlinux_struct_field(
self, struct_var_name, field_name, module, builder, local_sym_tab
):
"""Handle access to vmlinux struct fields"""
# Check if it's a variable of vmlinux struct type
if struct_var_name in local_sym_tab:
var_info = local_sym_tab[struct_var_name] # noqa: F841
# Need to check if this variable is a vmlinux struct
# This will depend on how you track vmlinux struct types in your symbol table
var_info: LocalSymbol = local_sym_tab[struct_var_name]
logger.info(
f"Attempting to access field {field_name} of possible vmlinux struct {struct_var_name}"
)
# Return pointer to field and field type
return None
return None
python_type: type = var_info.metadata
# Check if this is a context field (ctx) or a cast struct
is_context_field = var_info.var is None
if is_context_field:
# Handle context field access (original behavior)
struct_name = python_type.__name__
globvar_ir, field_data = self.get_field_type(struct_name, field_name)
builder.function.args[0].type = ir.PointerType(ir.IntType(8))
field_ptr = self.load_ctx_field(
builder,
builder.function.args[0],
globvar_ir,
field_data,
struct_name,
)
return field_ptr, field_data
else:
# Handle cast struct field access
struct_name = python_type.__name__
globvar_ir, field_data = self.get_field_type(struct_name, field_name)
# Handle cast struct field access (use bpf_probe_read_kernel)
# Load the struct pointer from the local variable
struct_ptr = builder.load(var_info.var)
# Determine the preallocated tmp name that assignment pass should have created
tmp_name = f"{struct_var_name}_{field_name}_tmp"
# Use bpf_probe_read_kernel for non-context struct field access
field_value = self.load_struct_field(
builder,
struct_ptr,
globvar_ir,
field_data,
struct_name,
local_sym_tab,
tmp_name,
)
# Return field value and field type
return field_value, field_data
else:
raise RuntimeError("Variable accessed not found in symbol table")
@staticmethod
def load_struct_field(
builder,
struct_ptr_int,
offset_global,
field_data,
struct_name=None,
local_sym_tab=None,
tmp_name: str | None = None,
):
"""
Generate LLVM IR to load a field from a regular (non-context) struct using bpf_probe_read_kernel.
Args:
builder: llvmlite IRBuilder instance
struct_ptr_int: The struct pointer as an i64 value (already loaded from alloca)
offset_global: Global variable containing the field offset (i64)
field_data: contains data about the field
struct_name: Name of the struct being accessed (optional)
local_sym_tab: symbol table (optional) - used to locate preallocated tmp storage
tmp_name: name of the preallocated temporary storage to use (preferred)
Returns:
The loaded value
"""
# Load the offset value
offset = builder.load(offset_global)
# Convert i64 to pointer type (BPF stores pointers as i64)
i8_ptr_type = ir.PointerType(ir.IntType(8))
struct_ptr = builder.inttoptr(struct_ptr_int, i8_ptr_type)
# GEP with offset to get field pointer
field_ptr = builder.gep(
struct_ptr,
[offset],
inbounds=False,
)
# Determine the appropriate field size based on field information
field_size_bytes = 8 # Default to 8 bytes (64-bit)
int_width = 64 # Default to 64-bit
needs_zext = False
if field_data is not None:
# Try to determine the size from field metadata
if field_data.type.__module__ == ctypes.__name__:
try:
field_size_bytes = ctypes.sizeof(field_data.type)
field_size_bits = field_size_bytes * 8
if field_size_bits in [8, 16, 32, 64]:
int_width = field_size_bits
logger.info(
f"Determined field size: {int_width} bits ({field_size_bytes} bytes)"
)
# Special handling for struct_xdp_md i32 fields
if struct_name == "struct_xdp_md" and int_width == 32:
needs_zext = True
logger.info(
"struct_xdp_md i32 field detected, will zero-extend to i64"
)
else:
logger.warning(
f"Unusual field size {field_size_bits} bits, using default 64"
)
except Exception as e:
logger.warning(
f"Could not determine field size: {e}, using default 64"
)
elif field_data.type.__module__ == "vmlinux":
# For pointers to structs or complex vmlinux types
if field_data.ctype_complex_type is not None and issubclass(
field_data.ctype_complex_type, ctypes._Pointer
):
int_width = 64 # Pointers are always 64-bit
field_size_bytes = 8
logger.info("Field is a pointer type, using 64 bits")
else:
logger.warning("Complex vmlinux field type, using default 64 bits")
# Use preallocated temporary storage if provided by allocation pass
local_storage_i8_ptr = None
if tmp_name and local_sym_tab and tmp_name in local_sym_tab:
# Expect the tmp to be an alloca created during allocation pass
tmp_alloca = local_sym_tab[tmp_name].var
local_storage_i8_ptr = builder.bitcast(tmp_alloca, i8_ptr_type)
else:
# Fallback: allocate inline (not ideal, but preserves behavior)
local_storage = builder.alloca(ir.IntType(int_width))
local_storage_i8_ptr = builder.bitcast(local_storage, i8_ptr_type)
logger.warning(f"Temp storage '{tmp_name}' not found. Allocating inline")
# Use bpf_probe_read_kernel to safely read the field
# This generates:
# %gep = getelementptr i8, ptr %struct_ptr, i64 %offset (already done above as field_ptr)
# %passed = tail call ptr @llvm.bpf.passthrough.p0.p0(i32 2, ptr %gep)
# %result = call i64 inttoptr (i64 113 to ptr)(ptr %local_storage, i32 %size, ptr %passed)
from pythonbpf.helper import emit_probe_read_kernel_call
emit_probe_read_kernel_call(
builder, local_storage_i8_ptr, field_size_bytes, field_ptr
)
# Load the value from local storage
value = builder.load(
builder.bitcast(local_storage_i8_ptr, ir.PointerType(ir.IntType(int_width)))
)
# Zero-extend i32 to i64 if needed
if needs_zext:
value = builder.zext(value, ir.IntType(64))
logger.info("Zero-extended i32 value to i64")
return value
@staticmethod
def load_ctx_field(builder, ctx_arg, offset_global, field_data, struct_name=None):
"""
Generate LLVM IR to load a field from BPF context using offset.
Args:
builder: llvmlite IRBuilder instance
ctx_arg: The context pointer argument (ptr/i8*)
offset_global: Global variable containing the field offset (i64)
field_data: contains data about the field
struct_name: Name of the struct being accessed (optional)
Returns:
The loaded value (i64 register or appropriately sized)
"""
# Load the offset value
offset = builder.load(offset_global)
# Ensure ctx_arg is treated as i8* (byte pointer)
i8_ptr_type = ir.PointerType()
# Cast ctx_arg to i8* if it isn't already
if str(ctx_arg.type) != str(i8_ptr_type):
ctx_i8_ptr = builder.bitcast(ctx_arg, i8_ptr_type)
else:
ctx_i8_ptr = ctx_arg
# GEP with explicit type - this is the key fix
field_ptr = builder.gep(
ctx_i8_ptr,
[offset],
inbounds=False,
)
# Get or declare the BPF passthrough intrinsic
module = builder.function.module
try:
passthrough_fn = module.globals.get("llvm.bpf.passthrough.p0.p0")
if passthrough_fn is None:
raise KeyError
except (KeyError, AttributeError):
passthrough_type = ir.FunctionType(
i8_ptr_type,
[ir.IntType(32), i8_ptr_type],
)
passthrough_fn = ir.Function(
module,
passthrough_type,
name="llvm.bpf.passthrough.p0.p0",
)
# Call passthrough to satisfy BPF verifier
verified_ptr = builder.call(
passthrough_fn, [ir.Constant(ir.IntType(32), 0), field_ptr], tail=True
)
# Determine the appropriate IR type based on field information
int_width = 64 # Default to 64-bit
needs_zext = False # Track if we need zero-extension for xdp_md
if field_data is not None:
# Try to determine the size from field metadata
if field_data.type.__module__ == ctypes.__name__:
try:
field_size_bytes = ctypes.sizeof(field_data.type)
field_size_bits = field_size_bytes * 8
if field_size_bits in [8, 16, 32, 64]:
int_width = field_size_bits
logger.info(f"Determined field size: {int_width} bits")
# Special handling for struct_xdp_md i32 fields
# Load as i32 but extend to i64 before storing
if struct_name == "struct_xdp_md" and int_width == 32:
needs_zext = True
logger.info(
"struct_xdp_md i32 field detected, will zero-extend to i64"
)
else:
logger.warning(
f"Unusual field size {field_size_bits} bits, using default 64"
)
except Exception as e:
logger.warning(
f"Could not determine field size: {e}, using default 64"
)
elif field_data.type.__module__ == "vmlinux":
# For pointers to structs or complex vmlinux types
if field_data.ctype_complex_type is not None and issubclass(
field_data.ctype_complex_type, ctypes._Pointer
):
int_width = 64 # Pointers are always 64-bit
logger.info("Field is a pointer type, using 64 bits")
# TODO: Add handling for other complex types (arrays, embedded structs, etc.)
else:
logger.warning("Complex vmlinux field type, using default 64 bits")
# Bitcast to appropriate pointer type based on determined width
ptr_type = ir.PointerType(ir.IntType(int_width))
typed_ptr = builder.bitcast(verified_ptr, ptr_type)
# Load and return the value
value = builder.load(typed_ptr)
# Zero-extend i32 to i64 for struct_xdp_md fields
if needs_zext:
value = builder.zext(value, ir.IntType(64))
logger.info("Zero-extended i32 value to i64 for struct_xdp_md field")
return value
def has_field(self, struct_name, field_name):
"""Check if a vmlinux struct has a specific field"""
if self.is_vmlinux_struct(struct_name):
python_type = self.vmlinux_symtab[struct_name].python_type
return hasattr(python_type, field_name)
return False
def get_field_type(self, vmlinux_struct_name, field_name):
"""Get the type of a field in a vmlinux struct"""
if self.is_vmlinux_struct(vmlinux_struct_name):
python_type = self.vmlinux_symtab[vmlinux_struct_name].python_type
if hasattr(python_type, field_name):
return self.vmlinux_symtab[vmlinux_struct_name].members[field_name]
else:
raise ValueError(
f"Field {field_name} not found in vmlinux struct {vmlinux_struct_name}"
)
else:
raise ValueError(f"{vmlinux_struct_name} is not a vmlinux struct")
def get_field_index(self, vmlinux_struct_name, field_name):
"""Get the type of a field in a vmlinux struct"""
if self.is_vmlinux_struct(vmlinux_struct_name):
python_type = self.vmlinux_symtab[vmlinux_struct_name].python_type
if hasattr(python_type, field_name):
return list(
self.vmlinux_symtab[vmlinux_struct_name].members.keys()
).index(field_name)
else:
raise ValueError(
f"Field {field_name} not found in vmlinux struct {vmlinux_struct_name}"
)
else:
raise ValueError(f"{vmlinux_struct_name} is not a vmlinux struct")

View File

@ -1,19 +1,22 @@
BPF_CLANG := clang
CFLAGS := -O2 -emit-llvm -target bpf -c
CFLAGS := -emit-llvm -target bpf -c -D__TARGET_ARCH_x86
SRC := $(wildcard *.bpf.c)
LL := $(SRC:.bpf.c=.bpf.ll)
OBJ := $(SRC:.bpf.c=.bpf.o)
LL0 := $(SRC:.bpf.c=.bpf.o0.ll)
.PHONY: all clean
all: $(LL) $(OBJ)
all: $(LL) $(OBJ) $(LL0)
%.bpf.o: %.bpf.c
$(BPF_CLANG) -O2 -g -target bpf -c $< -o $@
$(BPF_CLANG) -O2 -D__TARGET_ARCH_x86 -g -target bpf -c $< -o $@
%.bpf.ll: %.bpf.c
$(BPF_CLANG) $(CFLAGS) -g -S $< -o $@
$(BPF_CLANG) $(CFLAGS) -O2 -g -S $< -o $@
%.bpf.o0.ll: %.bpf.c
$(BPF_CLANG) $(CFLAGS) -O0 -g -S $< -o $@
clean:
rm -f $(LL) $(OBJ)
rm -f $(LL) $(OBJ) $(LL0)

View File

@ -0,0 +1,66 @@
// disksnoop.bpf.c
// eBPF program (compile with: clang -O2 -g -target bpf -c disksnoop.bpf.c -o disksnoop.bpf.o)
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>
char LICENSE[] SEC("license") = "GPL";
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__type(key, __u64);
__type(value, __u64);
__uint(max_entries, 10240);
} start_map SEC(".maps");
/* kprobe: record start timestamp keyed by request pointer */
SEC("kprobe/blk_mq_start_request")
int trace_start(struct pt_regs *ctx)
{
/* request * is first arg */
__u64 reqp = (__u64)(ctx->di);
__u64 ts = bpf_ktime_get_ns();
bpf_map_update_elem(&start_map, &reqp, &ts, BPF_ANY);
// /* optional debug:
bpf_printk("start: req=%llu ts=%llu\n", reqp, ts);
// */
return 0;
}
/* completion: compute latency and print data_len, cmd_flags, latency_us */
SEC("kprobe/blk_mq_end_request")
int trace_completion(struct pt_regs *ctx)
{
__u64 reqp = (__u64)(ctx->di);
__u64 *tsp;
__u64 now_ns;
__u64 delta_ns;
__u64 delta_us = 0;
bpf_printk("%lld", reqp);
tsp = bpf_map_lookup_elem(&start_map, &reqp);
if (!tsp)
return 0;
now_ns = bpf_ktime_get_ns();
delta_ns = now_ns - *tsp;
delta_us = delta_ns / 1000;
/* read request fields using CO-RE; needs vmlinux.h/BTF */
__u32 data_len = 0;
__u32 cmd_flags = 0;
/* __data_len is usually a 32/64-bit; use CORE read to be safe */
data_len = ( __u32 ) BPF_CORE_READ((struct request *)reqp, __data_len);
cmd_flags = ( __u32 ) BPF_CORE_READ((struct request *)reqp, cmd_flags);
/* print: "<bytes> <flags_hex> <latency_us>" */
bpf_printk("%u %x %llu\n", data_len, cmd_flags, delta_us);
/* remove from map */
bpf_map_delete_elem(&start_map, &reqp);
return 0;
}

View File

@ -1,25 +0,0 @@
#define __TARGET_ARCH_arm64
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
// Map: key = struct request*, value = u64 timestamp
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__type(key, struct request *);
__type(value, u64);
__uint(max_entries, 1024);
} start SEC(".maps");
// Attach to kprobe for blk_start_request
SEC("kprobe/blk_start_request")
int BPF_KPROBE(trace_start, struct request *req)
{
u64 ts = bpf_ktime_get_ns();
bpf_map_update_elem(&start, &req, &ts, BPF_ANY);
return 0;
}
char LICENSE[] SEC("license") = "GPL";

View File

@ -0,0 +1,15 @@
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
SEC("xdp")
int print_xdp_data(struct xdp_md *ctx)
{
// 'data' is a pointer to the start of packet data
long data = (long)ctx->data;
bpf_printk("ctx->data = %lld\n", data);
return XDP_PASS;
}
char LICENSE[] SEC("license") = "GPL";

View File

@ -2,18 +2,75 @@
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
char LICENSE[] SEC("license") = "Dual BSD/GPL";
char LICENSE[] SEC("license") = "GPL";
SEC("kprobe/do_unlinkat")
int kprobe_execve(struct pt_regs *ctx)
{
bpf_printk("unlinkat created");
return 0;
}
SEC("kretprobe/do_unlinkat")
int kretprobe_execve(struct pt_regs *ctx)
{
bpf_printk("unlinkat returned\n");
unsigned long r15 = ctx->r15;
bpf_printk("r15: %lld", r15);
unsigned long r14 = ctx->r14;
bpf_printk("r14: %lld", r14);
unsigned long r13 = ctx->r13;
bpf_printk("r13: %lld", r13);
unsigned long r12 = ctx->r12;
bpf_printk("r12: %lld", r12);
unsigned long bp = ctx->bp;
bpf_printk("rbp: %lld", bp);
unsigned long bx = ctx->bx;
bpf_printk("rbx: %lld", bx);
unsigned long r11 = ctx->r11;
bpf_printk("r11: %lld", r11);
unsigned long r10 = ctx->r10;
bpf_printk("r10: %lld", r10);
unsigned long r9 = ctx->r9;
bpf_printk("r9: %lld", r9);
unsigned long r8 = ctx->r8;
bpf_printk("r8: %lld", r8);
unsigned long ax = ctx->ax;
bpf_printk("rax: %lld", ax);
unsigned long cx = ctx->cx;
bpf_printk("rcx: %lld", cx);
unsigned long dx = ctx->dx;
bpf_printk("rdx: %lld", dx);
unsigned long si = ctx->si;
bpf_printk("rsi: %lld", si);
unsigned long di = ctx->di;
bpf_printk("rdi: %lld", di);
unsigned long orig_ax = ctx->orig_ax;
bpf_printk("orig_rax: %lld", orig_ax);
unsigned long ip = ctx->ip;
bpf_printk("rip: %lld", ip);
unsigned long cs = ctx->cs;
bpf_printk("cs: %lld", cs);
unsigned long flags = ctx->flags;
bpf_printk("eflags: %lld", flags);
unsigned long sp = ctx->sp;
bpf_printk("rsp: %lld", sp);
unsigned long ss = ctx->ss;
bpf_printk("ss: %lld", ss);
return 0;
}

View File

@ -0,0 +1,18 @@
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
char LICENSE[] SEC("license") = "GPL";
SEC("kprobe/blk_mq_start_request")
int example(struct pt_regs *ctx)
{
u64 a = ctx->r15;
struct request *req = (struct request *)(ctx->di);
unsigned int something_ns = BPF_CORE_READ(req, timeout);
unsigned int data_len = BPF_CORE_READ(req, __data_len);
bpf_printk("data length %lld %ld %ld\n", data_len, something_ns, a);
return 0;
}

View File

@ -0,0 +1,18 @@
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
char LICENSE[] SEC("license") = "GPL";
SEC("kprobe/blk_mq_start_request")
int example(struct pt_regs *ctx)
{
u64 a = ctx->r15;
struct request *req = (struct request *)(ctx->di);
unsigned int something_ns = req->timeout;
unsigned int data_len = req->__data_len;
bpf_printk("data length %lld %ld %ld\n", data_len, something_ns, a);
return 0;
}

View File

@ -0,0 +1,42 @@
// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
/*
Information gained from reversing this (multiple kernel versions):
There is no point of
```llvm
tail call void @llvm.dbg.value(metadata ptr %0, metadata !60, metadata !DIExpression()), !dbg !70
```
and the first argument of passthrough is fucking useless. It just needs to be a distinct integer:
```llvm
%9 = tail call ptr @llvm.bpf.passthrough.p0.p0(i32 3, ptr %8)
```
*/
SEC("tp/syscalls/sys_enter_execve")
int handle_setuid_entry(struct trace_event_raw_sys_enter *ctx) {
// Access each argument separately with clear variable assignments
long int id = ctx->id;
bpf_printk("This is context field %d", id);
/*
* the IR to aim for is
* %2 = alloca ptr, align 8
* store ptr %0, ptr %2, align 8
* Above, %0 is the arg pointer
* %5 = load ptr, ptr %2, align 8
* %6 = getelementptr inbounds %struct.trace_event_raw_sys_enter, ptr %5, i32 0, i32 2
* %7 = load i64, ptr @"llvm.trace_event_raw_sys_enter:0:16$0:2:0", align 8
* %8 = bitcast ptr %5 to ptr
* %9 = getelementptr i8, ptr %8, i64 %7
* %10 = bitcast ptr %9 to ptr
* %11 = call ptr @llvm.bpf.passthrough.p0.p0(i32 0, ptr %10)
* %12 = load i64, ptr %11, align 8, !dbg !101
*
*/
return 0;
}
char LICENSE[] SEC("license") = "GPL";

121617
tests/c-form/vmlinux.h vendored

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,21 @@
// xdp_rewrite.c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <linux/if_ether.h>
SEC("xdp")
int xdp_rewrite_mac(struct xdp_md *ctx)
{
void *data_end = (void *)(long)ctx->data_end;
void *data = (void *)(long)ctx->data;
struct ethhdr *eth = data;
if ((void*)(eth + 1) > data_end)
return XDP_PASS;
__u8 new_src[ETH_ALEN] = {0x02,0x00,0x00,0x00,0x00,0x02};
for (int i = 0; i < ETH_ALEN; i++) eth->h_source[i] = new_src[i];
return XDP_PASS;
}
char _license[] SEC("license") = "GPL";

View File

@ -0,0 +1,31 @@
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
struct fake_iphdr {
unsigned short useless;
unsigned short tot_len;
unsigned short id;
unsigned short frag_off;
unsigned char ttl;
unsigned char protocol;
unsigned short check;
unsigned int saddr;
unsigned int daddr;
};
SEC("xdp")
int xdp_prog(struct xdp_md *ctx) {
unsigned long data = ctx->data;
unsigned long data_end = ctx->data_end;
if (data + sizeof(struct ethhdr) + sizeof(struct fake_iphdr) > data_end) {
return XDP_ABORTED;
}
struct fake_iphdr *iph = (void *)data + sizeof(struct ethhdr);
bpf_printk("%d", iph->saddr);
return XDP_PASS;
}
char _license[] SEC("license") = "GPL";

View File

@ -0,0 +1,30 @@
import logging
from pythonbpf import bpf, section, bpfglobal, compile_to_ir
from pythonbpf import compile # noqa: F401
from vmlinux import TASK_COMM_LEN # noqa: F401
from vmlinux import struct_trace_event_raw_sys_enter # noqa: F401
from ctypes import c_int64, c_int32, c_void_p # noqa: F401
# from vmlinux import struct_uinput_device
# from vmlinux import struct_blk_integrity_iter
@bpf
@section("tracepoint/syscalls/sys_enter_execve")
def hello_world(ctx: struct_trace_event_raw_sys_enter) -> c_int64:
b = ctx.args
c = b[0]
print(f"This is context args field {c}")
return c_int64(0)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile_to_ir("args_test.py", "args_test.ll", loglevel=logging.INFO)
compile()

View File

@ -0,0 +1,22 @@
from vmlinux import XDP_PASS
from pythonbpf import bpf, section, bpfglobal, compile_to_ir
import logging
from ctypes import c_int64, c_void_p
@bpf
@section("kprobe/blk_mq_start_request")
def example(ctx: c_void_p) -> c_int64:
d = XDP_PASS # This gives an error, but
e = XDP_PASS + 0 # this does not
print(f"test1 {e} test2 {d}")
return c_int64(0)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile_to_ir("assignment_handling.py", "assignment_handling.ll", loglevel=logging.INFO)

View File

@ -0,0 +1,46 @@
from vmlinux import XDP_PASS, XDP_ABORTED
from vmlinux import (
struct_xdp_md,
)
from pythonbpf import bpf, section, bpfglobal, compile, compile_to_ir, struct
from ctypes import c_int64, c_ubyte, c_ushort, c_uint32, c_void_p
@bpf
@struct
class iphdr:
useless: c_ushort
tot_len: c_ushort
id: c_ushort
frag_off: c_ushort
ttl: c_ubyte
protocol: c_ubyte
check: c_ushort
saddr: c_uint32
daddr: c_uint32
@bpf
@section("xdp")
def ip_detector(ctx: struct_xdp_md) -> c_int64:
data = c_void_p(ctx.data)
data_end = c_void_p(ctx.data_end)
if data + 34 < data_end:
hdr = data + 14
iph = iphdr(hdr)
addr = iph.saddr
print(f"ipaddress: {addr}")
else:
return c_int64(XDP_ABORTED)
return c_int64(XDP_PASS)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile_to_ir("xdp_test_1.py", "xdp_test_1.ll")
compile()

View File

@ -1,4 +1,4 @@
from pythonbpf import bpf, map, section, bpfglobal, compile, struct
from pythonbpf import bpf, map, section, bpfglobal, compile, struct, compile_to_ir
from ctypes import c_void_p, c_int64, c_int32, c_uint64
from pythonbpf.maps import HashMap
from pythonbpf.helper import ktime
@ -71,4 +71,5 @@ def LICENSE() -> str:
return "GPL"
compile_to_ir("comprehensive.py", "comprehensive.ll")
compile()

View File

@ -1,4 +1,4 @@
from pythonbpf import bpf, struct, section, bpfglobal
from pythonbpf import bpf, struct, section, bpfglobal, compile
from pythonbpf.helper import comm
from ctypes import c_void_p, c_int64
@ -26,3 +26,6 @@ def hello(ctx: c_void_p) -> c_int64:
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile()

View File

@ -0,0 +1,44 @@
from pythonbpf import bpf, section, struct, bpfglobal, compile, map
from pythonbpf.maps import HashMap
from pythonbpf.helper import pid, comm
from ctypes import c_void_p, c_int64
@bpf
@struct
class val_type:
counter: c_int64
shizzle: c_int64
comm: str(16)
@bpf
@map
def last() -> HashMap:
return HashMap(key=val_type, value=c_int64, max_entries=16)
@bpf
@section("tracepoint/syscalls/sys_enter_clone")
def hello_world(ctx: c_void_p) -> c_int64:
obj = val_type()
obj.counter, obj.shizzle = 42, 96
comm(obj.comm)
t = last.lookup(obj)
if t:
print(f"Found existing entry: counter={obj.counter}, pid={t}")
last.delete(obj)
return 0 # type: ignore [return-value]
val = pid()
last.update(obj, val)
print(f"Map updated!, {obj.counter}, {obj.shizzle}, {val}")
return 0 # type: ignore [return-value]
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile()

View File

@ -0,0 +1,29 @@
from pythonbpf import bpf, section, bpfglobal, compile, struct
from ctypes import c_void_p, c_int64, c_uint64, c_uint32
from pythonbpf.helper import probe_read
@bpf
@struct
class data_t:
pid: c_uint32
value: c_uint64
@bpf
@section("tracepoint/syscalls/sys_enter_execve")
def test_probe_read(ctx: c_void_p) -> c_int64:
"""Test bpf_probe_read helper function"""
data = data_t()
probe_read(data.value, 8, ctx)
probe_read(data.pid, 4, ctx)
return 0
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile()

View File

@ -0,0 +1,25 @@
from pythonbpf import bpf, bpfglobal, section, BPF, trace_pipe
from ctypes import c_void_p, c_int64
from pythonbpf.helper import random
@bpf
@section("tracepoint/syscalls/sys_enter_clone")
def hello_world(ctx: c_void_p) -> c_int64:
r = random()
print(f"Hello, World!, {r}")
return 0 # type: ignore [return-value]
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
# Compile and load
b = BPF()
b.load()
b.attach_all()
trace_pipe()

View File

@ -0,0 +1,40 @@
from pythonbpf import bpf, section, bpfglobal, compile, struct
from ctypes import c_void_p, c_int64, c_uint32, c_uint64
from pythonbpf.helper import smp_processor_id, ktime
@bpf
@struct
class cpu_event_t:
cpu_id: c_uint32
timestamp: c_uint64
@bpf
@section("tracepoint/syscalls/sys_enter_execve")
def trace_with_cpu(ctx: c_void_p) -> c_int64:
"""Test bpf_get_smp_processor_id helper function"""
# Get the current CPU ID
cpu = smp_processor_id()
# Print it
print(f"Running on CPU {cpu}")
# Use it in a struct
event = cpu_event_t()
event.cpu_id = smp_processor_id()
event.timestamp = ktime()
print(f"Event on CPU {event.cpu_id} at time {event.timestamp}")
return 0
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile()

View File

@ -0,0 +1,31 @@
from pythonbpf import bpf, section, bpfglobal, compile
from ctypes import c_void_p, c_int64
from pythonbpf.helper import uid, pid
@bpf
@section("tracepoint/syscalls/sys_enter_execve")
def filter_by_user(ctx: c_void_p) -> c_int64:
"""Filter events by specific user ID"""
current_uid = uid()
# Only trace root user (UID 0)
if current_uid == 0:
process_id = pid()
print(f"Root process {process_id} executed")
# Or trace specific user (e.g., UID 1000)
if current_uid == 1002:
print("User 1002 executed something")
return 0
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile()

View File

@ -0,0 +1,93 @@
"""
Test struct values in HashMap.
This example stores a struct in a HashMap and reads it back,
testing the new set_value_struct() functionality in pylibbpf.
"""
from pythonbpf import bpf, map, struct, section, bpfglobal, BPF
from pythonbpf.helper import ktime, smp_processor_id, pid, comm
from pythonbpf.maps import HashMap
from ctypes import c_void_p, c_int64, c_uint32, c_uint64
import time
import os
@bpf
@struct
class task_info:
pid: c_uint64
timestamp: c_uint64
comm: str(16)
@bpf
@map
def cpu_tasks() -> HashMap:
return HashMap(key=c_uint32, value=task_info, max_entries=256)
@bpf
@section("tracepoint/sched/sched_switch")
def trace_sched_switch(ctx: c_void_p) -> c_int64:
cpu = smp_processor_id()
# Create task info struct
info = task_info()
info.pid = pid()
info.timestamp = ktime()
comm(info.comm)
# Store in map
cpu_tasks.update(cpu, info)
return 0 # type: ignore
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
# Compile and load
b = BPF()
b.load()
b.attach_all()
print("Testing HashMap with Struct Values")
cpu_map = b["cpu_tasks"]
cpu_map.set_value_struct("task_info") # Enable struct deserialization
print("Listening for context switches.. .\n")
num_cpus = os.cpu_count() or 16
try:
while True:
time.sleep(1)
print(f"--- Snapshot at {time.strftime('%H:%M:%S')} ---")
for cpu in range(num_cpus):
try:
info = cpu_map.lookup(cpu)
if info:
comm_str = (
bytes(info.comm).decode("utf-8", errors="ignore").rstrip("\x00")
)
ts_sec = info.timestamp / 1e9
print(
f" CPU {cpu}: PID={info.pid}, comm={comm_str}, ts={ts_sec:.3f}s"
)
except KeyError:
# No data for this CPU yet
pass
print()
except KeyboardInterrupt:
print("\nStopped")

View File

@ -0,0 +1,31 @@
from ctypes import c_int64, c_void_p
from pythonbpf import bpf, section, bpfglobal, compile_to_ir, compile
from vmlinux import struct_xdp_md
from vmlinux import XDP_PASS
@bpf
@section("xdp")
def print_xdp_dat2a(ct2x: struct_xdp_md) -> c_int64:
data = ct2x.data # 32-bit field: packet start pointer
print(f"ct2x->data = {data}")
return c_int64(XDP_PASS)
@bpf
@section("xdp")
def print_xdp_data(ctx: struct_xdp_md) -> c_int64:
data = ctx.data # 32-bit field: packet start pointer
something = c_void_p(data)
print(f"ctx->data = {something}")
return c_int64(XDP_PASS)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile_to_ir("i32_test.py", "i32_test.ll")
compile()

View File

@ -0,0 +1,24 @@
from ctypes import c_int64
from pythonbpf import bpf, section, bpfglobal, compile
from vmlinux import struct_xdp_md
from vmlinux import XDP_PASS
import logging
@bpf
@section("xdp")
def print_xdp_data(ctx: struct_xdp_md) -> c_int64:
data = 0
data = ctx.data # 32-bit field: packet start pointer
something = 2 + data
print(f"ctx->data = {something}")
return c_int64(XDP_PASS)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile(logging.INFO)

View File

@ -0,0 +1,24 @@
from ctypes import c_int64
from pythonbpf import bpf, section, bpfglobal, compile, compile_to_ir
from vmlinux import struct_xdp_md
from vmlinux import XDP_PASS
import logging
@bpf
@section("xdp")
def print_xdp_data(ctx: struct_xdp_md) -> c_int64:
data = c_int64(ctx.data) # 32-bit field: packet start pointer
something = 2 + data
print(f"ctx->data = {something}")
return c_int64(XDP_PASS)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile_to_ir("i32_test_fail_2.py", "i32_test_fail_2.ll")
compile(logging.INFO)

View File

@ -0,0 +1,53 @@
from pythonbpf import bpf, section, bpfglobal, BPF, trace_pipe
from pythonbpf import compile # noqa: F401
from vmlinux import struct_pt_regs
from ctypes import c_int64, c_int32, c_void_p # noqa: F401
@bpf
@section("kprobe/do_unlinkat")
def kprobe_execve(ctx: struct_pt_regs) -> c_int64:
r15 = ctx.r15
r14 = ctx.r14
r13 = ctx.r13
r12 = ctx.r12
bp = ctx.bp
bx = ctx.bx
r11 = ctx.r11
r10 = ctx.r10
r9 = ctx.r9
r8 = ctx.r8
ax = ctx.ax
cx = ctx.cx
dx = ctx.dx
si = ctx.si
di = ctx.di
orig_ax = ctx.orig_ax
ip = ctx.ip
cs = ctx.cs
flags = ctx.flags
sp = ctx.sp
ss = ctx.ss
print(f"r15={r15} r14={r14} r13={r13}")
print(f"r12={r12} rbp={bp} rbx={bx}")
print(f"r11={r11} r10={r10} r9={r9}")
print(f"r8={r8} rax={ax} rcx={cx}")
print(f"rdx={dx} rsi={si} rdi={di}")
print(f"orig_rax={orig_ax} rip={ip} cs={cs}")
print(f"eflags={flags} rsp={sp} ss={ss}")
return c_int64(0)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
b = BPF()
b.load()
b.attach_all()
trace_pipe()

View File

@ -0,0 +1,27 @@
from vmlinux import struct_request, struct_pt_regs
from pythonbpf import bpf, section, bpfglobal, compile_to_ir, compile
import logging
from ctypes import c_int64
@bpf
@section("kprobe/blk_mq_start_request")
def example(ctx: struct_pt_regs) -> c_int64:
a = ctx.r15
req = struct_request(ctx.di)
d = req.__data_len
b = ctx.r12
c = req.timeout
print(f"data length {d} and {c} and {a}")
print(f"ctx arg {b}")
return c_int64(0)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile_to_ir("requests.py", "requests.ll", loglevel=logging.INFO)
compile()

View File

@ -0,0 +1,21 @@
from vmlinux import struct_pt_regs
from pythonbpf import bpf, section, bpfglobal, compile_to_ir
import logging
from ctypes import c_int64
@bpf
@section("kprobe/blk_mq_start_request")
def example(ctx: struct_pt_regs) -> c_int64:
req = ctx.di
print(f"data length {req}")
return c_int64(0)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile_to_ir("requests2.py", "requests2.ll", loglevel=logging.INFO)

View File

@ -44,4 +44,4 @@ def LICENSE() -> str:
compile_to_ir("simple_struct_test.py", "simple_struct_test.ll", loglevel=logging.DEBUG)
# compile()
compile()

View File

@ -0,0 +1,29 @@
import logging
from pythonbpf import bpf, section, bpfglobal, compile_to_ir
from pythonbpf import compile # noqa: F401
from vmlinux import TASK_COMM_LEN # noqa: F401
from vmlinux import struct_trace_event_raw_sys_enter # noqa: F401
from ctypes import c_int64, c_int32, c_void_p # noqa: F401
# from vmlinux import struct_uinput_device
# from vmlinux import struct_blk_integrity_iter
@bpf
@section("tracepoint/syscalls/sys_enter_execve")
def hello_world(ctx: struct_trace_event_raw_sys_enter) -> c_int64:
b = ctx.id
print(f"This is context field {b}")
return c_int64(0)
@bpf
@bpfglobal
def LICENSE() -> str:
return "GPL"
compile_to_ir("struct_field_access.py", "struct_field_access.ll", loglevel=logging.INFO)
compile()

View File

@ -144,7 +144,8 @@ mkdir -p examples
cd examples || exit 1
echo "Fetching example files list..."
FILES=$(curl -s "https://api.github.com/repos/pythonbpf/Python-BPF/contents/examples" | grep -o '"path": "examples/[^"]*"' | awk -F'"' '{print $4}')
FILES=$(curl -s "https://api.github.com/repos/pythonbpf/Python-BPF/contents/examples" | grep -E -o '"path": "examples/[^"]*(\.md|\.ipynb|\.py)"' | awk -F'"' '{print $4}')
BCC_EXAMPLES=$(curl -s "https://api.github.com/repos/pythonbpf/Python-BPF/contents/BCC-Examples" | grep -E -o '"path": "BCC-Examples/[^"]*(\.md|\.ipynb|\.py)"' | awk -F'"' '{print $4}')
if [ -z "$FILES" ]; then
echo "Failed to fetch file list from repository. Using fallback method..."
@ -166,11 +167,20 @@ if [ -z "$FILES" ]; then
curl -s -O "https://raw.githubusercontent.com/pythonbpf/Python-BPF/master/examples/$example"
done
else
mkdir examples && cd examples
for file in $FILES; do
filename=$(basename "$file")
echo "Downloading: $filename"
curl -s -o "$filename" "https://raw.githubusercontent.com/pythonbpf/Python-BPF/master/$file"
done
cd ..
mkdir BCC-Examples && cd BCC-Examples
for file in $BCC_EXAMPLES; do
filename=$(basename "$file")
echo "Downloading: $filename"
curl -s -o "$filename" "https://raw.githubusercontent.com/pythonbpf/Python-BPF/master/$file"
done
cd ..
fi
cd "$WORK_DIR" || exit 1