FlashInfer-Bench: Building the Virtuous Cycle for AI-driven LLM Systems
A standardized, closed-loop framework that connects kernel generation, benchmarking, and deployment
Routing algorithm for real-world LLM services that balances TTFT SLOs and cost via selective API offload.
Production-oriented cache replacement algorithm for VMware vSAN.
A middle-layer API that bridges local web agents with the browser environment, with Overleaf and Google Workspace integrations.