Tailored for e-commerce, accelerating new growth opportunities


Hyper-personalized search and discovery for every user journey
Acquire high-value customers with efficiency and precision
Transform content creation with generative intelligence
Our proprietary, high-performance LLM inference framework is purpose-built for the DeepSeek model family. It integrates prefill/decode (PD) separation, EPLB (expert-parallel load balancing), DeepEP (expert-parallel communication), and DeepGEMM (FP8 GEMM kernels) to deliver over 50% higher throughput and half the end-to-end latency in multi-GPU, multi-node environments. This foundation enables scalable, real-time AI deployment for mission-critical business scenarios.


















ByteArk is a pioneering technology company specializing in AI infrastructure and enterprise solutions, headquartered in Hangzhou Future Science City. We focus on LLM inference optimization, industry-grade AI applications, and high-performance GPU computing, building a unified platform that bridges foundational computing and business innovation.
Driven by an engineering-first culture, over 70% of our team are technical experts from top global universities and Fortune 500 tech leaders. ByteArk delivers reliable, scalable AI compute to clients worldwide. Recognized as a National High-Tech Enterprise and Zhejiang Provincial Innovation Leader, we hold 100+ patents and software copyrights and are rapidly expanding our global AI infrastructure footprint.
Create tenfold value, take modest returns, give back to society
We ground every decision in objective reality, deep analysis, and rational thought. True value is created only through wisdom and facts.
We embrace lifelong learning and a holistic view—understanding business, clients, and markets to drive innovation and adapt to change.
We listen deeply, communicate with clarity, and connect needs with solutions. Effective communication is the bridge to value creation.
We pursue excellence, set ambitious goals, and continuously raise the bar. Only by striving for the best can we deliver exponential value.
We uphold honesty, integrity, and the courage to acknowledge and correct mistakes. Integrity is the bedrock of sustainable value and our promise to society.
















'Entrepreneurship is like sailing: you need a distant destination, but must also discover new islands for resources along the way.' CEO David, a serial entrepreneur, founded ByteArk in 2018 after a career as an IT engineer in the semiconductor industry, where he led smartphone projects.

A visionary entrepreneur of the 1980s generation. Founded multiple global businesses with annual revenues exceeding $20M. Early blockchain pioneer since 2014, specializing in trading and capital management. In 2018, founded ByteArk, managing over $40M in revenue and $100M+ in digital assets.









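Focuses on the efficiency and execution-path optimization of the inference execution stage itself, including Prefill/Decode decoupling, cache scheduling, and sampling optimization.
Responsibilities:
1. Own system-level optimization of the execution path, resource scheduling, and communication modules of the LLM inference system;
2. Design and implement a scheduling and execution architecture that supports large-scale multi-GPU deployment, improving system throughput;
3. Optimize communication links and data transfer to reduce cross-node communication latency and bandwidth bottlenecks;
4. Drive the efficient use of mixed-precision strategies (such as FP16, BF16, and INT8) in the inference framework;
5. Support and advance deep system-level performance improvements in open-source or in-house inference frameworks (such as vLLM and SGLang).
Requirements:
1. Bachelor's degree or above in computer science, artificial intelligence, software engineering, or a related field;
2. Familiarity with mainstream inference frameworks; optimization experience with frameworks such as vLLM, SGLang, or TensorRT-LLM is preferred;
3. Familiarity with communication optimization, including hands-on experience with communication libraries and technologies such as NCCL, NVSHMEM, and RDMA, and an understanding of methods for reducing communication overhead;
4. Understanding of resource management mechanisms, and familiarity with system-level optimization techniques such as task scheduling, concurrency control, NUMA architecture, and CPU/GPU affinity tuning;
5. Ability to analyze system-level performance bottlenecks, lead the cross-module diagnosis and resolution of complex performance problems, and drive performance optimization through to completion.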
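Focuses on the underlying infrastructure and system architecture of the inference framework itself, such as resource allocation, cross-node communication, GPU orchestration, and mixed-precision computation.
Responsibilities:
1. Own system-level optimization of the execution path, resource scheduling, and communication modules of the LLM inference system;
2. Design and implement a scheduling and execution architecture that supports large-scale multi-GPU deployment, improving system throughput;
3. Optimize communication links and data transfer to reduce cross-node communication latency and bandwidth bottlenecks;
4. Drive the efficient use of mixed-precision strategies (such as FP16, BF16, and INT8) in the inference framework;
5. Support and advance deep system-level performance improvements in open-source or in-house inference frameworks (such as vLLM and SGLang).
Requirements:
1. Bachelor's degree or above in computer science, artificial intelligence, software engineering, or a related field;
2. Familiarity with mainstream inference frameworks; optimization experience with frameworks such as vLLM, SGLang, or TensorRT-LLM is preferred;
3. Familiarity with communication optimization, including hands-on experience with communication libraries and technologies such as NCCL, NVSHMEM, and RDMA, and an understanding of methods for reducing communication overhead;
4. Understanding of resource management mechanisms, and familiarity with system-level optimization techniques such as task scheduling, concurrency control, NUMA architecture, and CPU/GPU affinity tuning;
5. Ability to analyze system-level performance bottlenecks, lead the cross-module diagnosis and resolution of complex performance problems, and drive performance optimization through to completion.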
