Rackera

FPGA & Trading Stack Specialist

Rackera
Contract
Pennsylvania
GC, US Citizen, H1B
15 - 25 Years
Mar 16th, 2026
Required Skillset:
Verilog; Quartus; Vivado; Questa; C++20; C++17; Xilinx Versal; VHDL; perf; ftrace; SystemVerilog; scheduler; PCIe; Linux kernel internals; IRQs; RCU; huge pages; CPU pinning; NUMA engineering; cache topology optimization; rdtsc/tsc synchronization; PTP / IEEE-1588; DPDK; Solarflare OpenOnload; Mellanox VMA; RDMA (RoCE, iWARP); multicast market-data optimization; custom TCP/UDP stacks; NIC firmware tuning; exchange connectivity stacks; Xilinx UltraScale+; Intel Stratix; Intel Agilex; ModelSim; DMA; HBM; on-NIC processing; FPGA feed handlers; order gateways; packet filtering; real-time Linux; BIOS tuning; PCIe lane configuration; SR-IOV; HugeTLB; transparent huge pages; CPU microarchitecture tuning; flame graphs; eBPF; hardware timestamping; nanosecond-level profiling; jitter elimination; deterministic system design; exchange protocols (NASDAQ, NYSE, CME, ICE, OPRA); market-data normalization; order routing engines; pre-trade risk systems; tick-to-trade optimization; microwave / millimeter-wave trading networks; GPS-disciplined clocks; custom NIC firmware; co-location data-center optimization; bare-metal Kubernetes for HFT; P4 programmable networking; SmartNIC development; ASIC prototyping

Job Description

Please reach out at
xxxxxxxxxxxxxxx
Urgent requirement

Job Title: FPGA & Trading Stack Specialist 
Experience Required: 14+ years  
Assignment Duration: 12+ Months 
Engagement Type: Contract 
Work Location: Bala Cynwyd, PA 
Hourly Rate: $150/Hr. 
End Client: SIG 
Key Responsibilities:
• Design, code, and optimize high-performance C++17/20 trading systems 
including market-data handlers, order routing engines, and pre-trade risk 
services 
• Build lock-free, wait-free, cache-aligned software components and custom 
memory allocators 
• Develop exchange protocol stacks (NASDAQ, NYSE, CME, ICE, OPRA) and 
high-throughput feed normalization pipelines 
• Deliver measurable improvements in tick-to-trade latency, tail latency, and 
throughput 
• Engineer bare-metal, deterministic Linux environments optimized for 
real-time trading workloads 
• Perform kernel, driver, and interrupt-path optimization including IRQ 
routing, RCU tuning, scheduler tuning, and context-switch minimization 
• Implement CPU isolation, NUMA locality strategies, cache-coherent 
layouts, and huge-page memory architectures 
• Produce stable, low-jitter execution profiles across trading systems 
• Architect and implement kernel-bypass networking stacks using DPDK, 
Mellanox VMA, Solarflare OpenOnload 
• Develop RDMA-enabled and multicast market-data pipelines 
• Tune NIC firmware, DMA paths, PCIe configurations, and network queues 
• Build and maintain exchange connectivity platforms and 
colocation-optimized data paths 
• Design and develop FPGA-accelerated feed handlers, order gateways, and 
packet-filtering engines 
• Implement ultra-low-latency pipelines using Xilinx UltraScale+/Versal or 
Intel Stratix/Agilex platforms 
• Collaborate on hardware/software co-design including PCIe, DMA, HBM, 
and SmartNIC architectures 
• Deliver nanosecond-scale latency improvements through hardware offload 
• Engineer deterministic trading platforms where timing, jitter, and physical 
constraints are first-class design inputs 
• Design systems accounting for cache behavior, memory latency, bus 
contention, and hardware clocks 
• Apply PTP / IEEE-1588 synchronization, hardware timestamping, and 
rdtsc-based measurement frameworks 
• Build and maintain nanosecond-resolution profiling, tracing, and telemetry 
tooling 
• Use perf, eBPF, ftrace, flame graphs, and hardware counters to isolate 
latency 
• Drive continuous reduction of variance, tail latency, and execution jitter 
• Work directly with traders, quants, and exchange operations teams to 
support strategy requirements 
• Optimize platform behavior for market-data ingestion, order flow, and 
pre-trade risk controls 
• Support production environments with rapid latency triage and 
optimization cycles 
Required Technical Expertise:
• 15+ years in high-performance or trading systems 
• Prior experience in HFT, exchanges, or market-data firms 
• Demonstrated history of nanosecond-level optimization 
• Deep coding background with adjacent hardware experience 
• Comfortable debugging production systems under live trading 
conditions 
• Modern C++17/20 (lock-free, cache-aligned, zero-copy architectures) 
• Linux kernel internals (scheduler, IRQs, RCU, huge pages) 
• CPU pinning, NUMA engineering, cache topology optimization 
• rdtsc/tsc synchronization, PTP / IEEE-1588 
• Kernel bypass: DPDK, Solarflare OpenOnload, Mellanox VMA 
• RDMA (RoCE, iWARP) 
• Multicast market-data optimization 
• Custom TCP/UDP stacks 
• NIC firmware tuning 
• Exchange connectivity stacks 
• Xilinx UltraScale+, Alveo, Versal 
• Intel Stratix, Agilex 
• Vivado, Quartus, ModelSim, Questa 
• Verilog / SystemVerilog / VHDL 
• PCIe, DMA, HBM, on-NIC processing 
• FPGA feed handlers, order gateways, packet filtering 
• Real-time Linux 
• BIOS tuning 
• PCIe lane configuration 
• SR-IOV 
• HugeTLB, transparent huge pages 
• CPU microarchitecture tuning 
• perf, ftrace, flame graphs, eBPF 
• Hardware timestamping 
• Nanosecond-level profiling 
• Jitter elimination 
• Deterministic system design 
• Exchange protocols: NASDAQ, NYSE, CME, ICE, OPRA 
• Market-data normalization 
• Order routing engines 
• Pre-trade risk systems 
• Tick-to-trade optimization 
• Microwave / millimeter-wave trading networks 
• GPS-disciplined clocks 
• Custom NIC firmware 
• Co-location data-center optimization 
• Bare-metal Kubernetes for HFT 
• P4 programmable networking 
• SmartNIC development 
• ASIC prototyping
