Add ZCU102 (UltraScale+ A53 EL3) bare-metal wolfIP port with GEM3 #121
Open
dgarske wants to merge 6 commits into
Pull request overview
Adds a new bare-metal AArch64 (Cortex-A53 EL3) wolfIP port targeting the Xilinx ZCU102 board, including a clean-room Cadence GEM3 + DP83867 PHY driver, minimal EL3 MMU/GIC/UART bring-up, and supporting JTAG/bootgen/SD tooling.
Changes:
- Introduces `src/port/zcu102/` (startup vectors, MMU setup, GICv2 driver, polled UART, GEM3 Ethernet + DP83867 PHY, UDP echo + DHCP demo, build/link scripts).
- Adds ZCU102 JTAG loader scripts (generic `tools/scripts/zcu102/` and port-specific `src/port/zcu102/jtag/`).
- Adds BOOT.BIN generation templates and an SD flashing helper.
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 16 comments.
| File | Description |
|---|---|
| tools/scripts/zcu102/README.md | Documents the generic ZCU102 xsdb loader pattern and constraints. |
| tools/scripts/zcu102/jtag_load.tcl | Generic xsdb JTAG loader for AArch64 EL3 apps (OCM load + RVBAR loop + entry jump). |
| src/port/zcu102/.gitignore | Ignores local build artifacts for the ZCU102 port. |
| src/port/zcu102/board.h | Board-specific base addresses/IRQs/clock/reset regs and default MAC. |
| src/port/zcu102/config.h | UDP-focused wolfIP configuration for the ZCU102 port. |
| src/port/zcu102/flash_sd.sh | Helper to copy BOOT.BIN to an SD boot partition with safety checks. |
| src/port/zcu102/gem.h | Public GEM3 + MDIO API surface for the port. |
| src/port/zcu102/gem.c | Clean-room GEM3 driver (BD rings, MDIO, polled RX/TX integration with wolfIP). |
| src/port/zcu102/gic.h | Minimal GICv2 interface and IRQ dispatch hooks. |
| src/port/zcu102/gic.c | GIC-400 bring-up and dispatch implementation (plus polled dispatch helper). |
| src/port/zcu102/jtag/boot.sh | Port-local wrapper to build a flat binary and invoke xsdb boot sequence. |
| src/port/zcu102/jtag/boot.tcl | Port-local xsdb sequence to init PS, load OCM, and run the app. |
| src/port/zcu102/jtag/boot_iter.sh | Developer iteration helper (power-cycle + hw_server restart + boot). |
| src/port/zcu102/main.c | Demo app (wolfIP init, DHCP, UDP echo) + wrapped memset/memcpy + exception reporting. |
| src/port/zcu102/mmu.h | Declares EL3 MMU enable entrypoint. |
| src/port/zcu102/mmu.c | Static EL3 page tables and MMU enable sequence (TCR/MAIR/TTBR setup). |
| src/port/zcu102/phy_dp83867.h | DP83867 PHY init/link-status API. |
| src/port/zcu102/phy_dp83867.c | DP83867 configuration (strap fix, delays, AN/link polling) via MDIO. |
| src/port/zcu102/README.md | Port-level documentation (features, build/boot workflow, expected output). |
| src/port/zcu102/startup.S | EL3 vectors + startup (BSS clear, MMU enable, IRQ trampoline, exception trampolines). |
| src/port/zcu102/target.ld | AArch64 linker script for OCM-based layout and special sections. |
| src/port/zcu102/timer.h | Generic timer-based delay utilities. |
| src/port/zcu102/uart.h | UART API for the port. |
| src/port/zcu102/uart.c | Polled Cadence UART0 driver and small print helpers. |
| src/port/zcu102/bootgen/boot.bif | BOOT.BIN template for bootgen. |
| src/port/zcu102/bootgen/build_bootbin.sh | Script to render the BIF template and run bootgen. |
| src/port/zcu102/Makefile | Port-local build (app.elf + BOOT.BIN) and core compilation strategy. |
Comment on lines +160 to +172

    el3_irq_trampoline:
        sub sp, sp, #(16 * 16)
        stp x0, x1, [sp, #(0 * 16)]
        stp x2, x3, [sp, #(1 * 16)]
        stp x4, x5, [sp, #(2 * 16)]
        stp x6, x7, [sp, #(3 * 16)]
        stp x8, x9, [sp, #(4 * 16)]
        stp x10, x11, [sp, #(5 * 16)]
        stp x12, x13, [sp, #(6 * 16)]
        stp x14, x15, [sp, #(7 * 16)]
        stp x16, x17, [sp, #(8 * 16)]
        stp x18, x29, [sp, #(9 * 16)]
        str x30, [sp, #(10 * 16)]

Comment on lines +138 to +147

    /* L2_PERIPH: 3..4 GB range. All Device-nGnRnE except the last
     * 2 MB block which contains OCM (0xFFFC0000..0xFFFFFFFF) and
     * must be Normal+executable so we can fetch our code from OCM. */
    for (i = 0; i < 511; i++) {
        addr = 3ULL * L1_BLOCK_SIZE + (uint64_t)i * L2_BLOCK_SIZE;
        L2_PERIPH[i] = BLOCK_DEVICE(addr);
    }
    L2_PERIPH[511] = BLOCK_NORMAL(3ULL * L1_BLOCK_SIZE
                                  + 511ULL * L2_BLOCK_SIZE);

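The address arithmetic in the hunk above can be sanity-checked on a host. This is only a sketch: the block sizes (1 GB at L1, 2 MB at L2) are assumed from a 4 KB translation granule, and `l2_periph_block_base` and `range_in_block` are hypothetical helpers, not functions from the port's `mmu.c`.

```c
#include <stdint.h>

/* Assumed sizes for a 4 KB translation granule; the port's mmu.c
 * defines its own L1_BLOCK_SIZE / L2_BLOCK_SIZE. */
#define L1_BLOCK_SIZE (1ULL << 30)   /* 1 GB */
#define L2_BLOCK_SIZE (1ULL << 21)   /* 2 MB */

/* Hypothetical helper: base address of entry i in the 3..4 GB L2 table. */
static uint64_t l2_periph_block_base(unsigned i)
{
    return 3ULL * L1_BLOCK_SIZE + (uint64_t)i * L2_BLOCK_SIZE;
}

/* Hypothetical helper: returns 1 if the inclusive range [start, end]
 * lies entirely inside L2 block i, 0 otherwise. */
static int range_in_block(unsigned i, uint64_t start, uint64_t end)
{
    uint64_t base = l2_periph_block_base(i);
    return start >= base && end <= base + L2_BLOCK_SIZE - 1;
}
```

Under these assumptions entry 511 starts at `0xFFE00000`, so the OCM window `0xFFFC0000..0xFFFFFFFF` falls entirely inside the single block that is mapped Normal.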

     * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1335, USA
     *
     * GIC-400 (ARM GICv2) minimal driver for Cortex-A53 EL3 on ZynqMP.
     * Configures all SPIs as Group 1, level-triggered, targeted at CPU0,

Comment on lines +21 to +27

     * Cadence GEM driver for ZynqMP GEM3 (on-board RJ45 on ZCU102).
     *
     * - 32-bit DMA addressing (DDR low bank only, < 4 GB).
     * - Polled TX (matches existing wolfIP port pattern, simplest cert).
     * - IRQ-driven RX (GIC SPI 63 - see board.h).
     * - BDs and frame buffers in .dma_buffers, MMU-marked Device-nGnRnE.
     *

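For readers unfamiliar with the descriptor format this header comment refers to, here is a minimal sketch of the 8-byte (32-bit addressing) GEM buffer descriptor. The bit positions follow the Cadence GEM descriptor layout as I understand it from the Zynq reference manuals and should be checked against the port's `gem.c`. The struct, macro, and function names are illustrative, not the port's own.

```c
#include <stdint.h>

/* 8-byte descriptor used with 32-bit DMA addressing (assumed layout). */
struct gem_bd {
    uint32_t addr;    /* buffer address; RX also carries OWN/WRAP in bits 1:0 */
    uint32_t status;  /* length and control/status flags */
};

/* RX flags live in word 0 (assumed bit positions). */
#define RXBUF_OWN_SW  (1u << 0)   /* software owns the buffer */
#define RXBUF_WRAP    (1u << 1)   /* last descriptor in the ring */

/* TX flags live in word 1 (assumed bit positions). */
#define TXBUF_LAST    (1u << 15)  /* last buffer of the frame */
#define TXBUF_WRAP    (1u << 30)  /* last descriptor in the ring */
#define TXBUF_USED    (1u << 31)  /* MAC has finished with this BD */

/* Hypothetical helper: a one-entry TX ring the MAC will never transmit from. */
static void gem_bd_park_tx(struct gem_bd *bd)
{
    bd->addr = 0;
    bd->status = TXBUF_USED | TXBUF_WRAP | TXBUF_LAST;
}

/* Hypothetical helper: a one-entry RX ring the MAC will never write into. */
static void gem_bd_park_rx(struct gem_bd *bd)
{
    bd->addr = RXBUF_WRAP | RXBUF_OWN_SW;
    bd->status = 0;
}
```

A parked descriptor of this kind matches the `USED|WRAP|LAST` and `WRAP|OWN_SW` values mentioned in the PR description for the unused priority queues.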
Comment on lines +3 to +16

     * Memory map:
     *   DDR low : 2 GB @ 0x00000000 (FSBL hands control with DDR initialized)
     *   OCM     : 256 KB @ 0xFFFC0000 (not used by this app)
     *
     * App layout in DDR:
     *   0x00000000 - 0x000FFFE0  vectors, .text, .rodata, .data, .bss
     *                            (linker just packs them in order; stack at top)
     *   0x00100000               _stack_top (1 MB)
     *   0x00200000 - 0x003FFFFF  .dma_buffers (2 MB, 2 MB-aligned, mapped
     *                            Device-nGnRnE by the MMU table in mmu.c)
     *   0x00400000+              free for future use (e.g. heap)
     *
     * The 2 MB alignment of .dma_buffers is required because the MMU page
     * tables flip its attribute at L2 block (2 MB) granularity.

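The alignment requirement stated in this comment can be expressed as a small host-side check. This is a sketch assuming 2 MB L2 blocks (4 KB granule); `l2_block_aligned` and `l2_index` are hypothetical helpers and do not exist in the port.

```c
#include <stdint.h>

#define L2_BLOCK_SIZE (1ULL << 21)   /* 2 MB, assumed 4 KB granule */

/* Hypothetical helper: returns 1 if the inclusive range starts and ends
 * on L2 block boundaries, so its attributes can be set per block. */
static int l2_block_aligned(uint64_t start, uint64_t end_inclusive)
{
    return (start % L2_BLOCK_SIZE) == 0 &&
           ((end_inclusive + 1) % L2_BLOCK_SIZE) == 0;
}

/* Hypothetical helper: index of the L2 entry covering addr within its
 * 1 GB L1 region (512 entries of 2 MB each). */
static unsigned l2_index(uint64_t addr)
{
    return (unsigned)((addr >> 21) & 0x1FF);
}
```

With the layout quoted above, `0x00200000 - 0x003FFFFF` is exactly L2 entry 1, so changing that one entry to Device-nGnRnE affects `.dma_buffers` and nothing else.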
…P83867 + UDP echo
Summary
First Cortex-A / aarch64 port. Verified end-to-end on hardware: DHCP, ping, bidirectional UDP echo on port 7.
Features
- `src/port/zcu102/` port: GCC bare-metal, single Cortex-A53 at EL3, no Xilinx Standalone BSP, no `xparameters.h`.
- … `MAX_TCPSOCKETS=2` only for the timer-heap minimum; app opens no TCP sockets).
- … `jtag/boot.sh` + `jtag/boot.tcl`) - OCM-only iteration, no SD swap.
- `bootgen/` template + `flash_sd.sh` for SD boot via stock FSBL.

Notable fixes captured during bring-up
- `DMACR[30]` must be clear with 8-byte BDs (setting it switches GEM to 16-byte BD format with `addr_hi`; MAC then writes frames to bogus high addresses, counted but never delivered). 64-bit AXI bus width comes from `NWCFG[21]` alone.
- `cache_clean` after every BD recycle in `gem_isr` and `eth_poll`, and `cache_inval` before reading `TXBUF_USED` in `eth_send` (MAC writes USED-back to DDR, not coherent with CPU D-cache; without the inval the TX spin loop times out and wedges sustained UDP TX).
- … `USED|WRAP|LAST`, RX `WRAP|OWN_SW`) to keep MAC from walking uninitialised priority queues.
- `memset`/`memcpy` wrapped via `-Wl,--wrap` to avoid `dc zva` hang on this A53 setup (even with `SCTLR_EL3.DZE=1`).
- … `0xFFE00000-0xFFFFFFFF`) mapped Normal-WB executable so code runs from OCM after MMU enable.

Known limitations
- … `GICC_IAR` ack works when polled). Worked around by driving `gem_isr()` from `eth_poll()` in the main loop. Real root cause is open.
- `MAX_TCPSOCKETS=2` is the minimum the current wolfIP core allows (`MAX_TIMERS = MAX_TCPSOCKETS * 3`); upstream follow-up should decouple.

Test
- `make -C src/port/zcu102 CROSS_COMPILE=aarch64-none-elf-`
- … `app.elf` (or SD-boot via `BOOT.BIN`), watch UART0 @ 115200.
- `ping <ip>` and `nc -u <ip> 7`.
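
The `-Wl,--wrap` workaround listed under the notable fixes can be sketched as follows. This is not the code from `main.c`, only an assumed minimal form: with `-Wl,--wrap=memset -Wl,--wrap=memcpy`, GNU ld redirects calls to `memset`/`memcpy` to `__wrap_memset`/`__wrap_memcpy`, and plain byte loops avoid any library routine that might issue `dc zva`.

```c
#include <stddef.h>

/* Byte-wise memset replacement. The volatile pointer keeps the compiler
 * from recognising the loop and turning it back into a memset() call,
 * which under --wrap would resolve to this function and recurse. */
void *__wrap_memset(void *dst, int c, size_t n)
{
    volatile unsigned char *p = dst;

    while (n--)
        *p++ = (unsigned char)c;
    return dst;
}

/* Byte-wise memcpy replacement with the same precaution. */
void *__wrap_memcpy(void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}
```

Byte loops are slow compared with an optimised library routine, which is an acceptable trade for a UDP echo demo but worth revisiting if throughput matters.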