GPU, CUDA Programming,
and High Performance Computing

EXPERIENCE.

Tron Future Tech

Senior Software Engineer

  • GPU Acceleration for DSP Algorithms
    • Develop CUDA kernels for image processing and digital signal processing.
    • Profile and optimize kernels for embedded devices.
    • Research parallel computing techniques and new CUDA features.
  • Deep Learning Model Deployment
    • Optimize DL models with ONNX.
    • Deploy DL models using inference frameworks such as ONNX Runtime and TensorRT.
  • Projects
    • Digital Signal Processing:
      • Developed a CUDA kernel for a super-resolution direction-of-arrival (DOA) estimation flow, using kernel fusion, warp-level optimization, and other GPU techniques to bring GPU execution time under 100 μs.
    • Computer Vision:
      • Implemented a hardware-accelerated Connected Component Labeling (CCL) algorithm, achieving a 10x speedup on embedded devices across different clock rates.
    • CI/CD:
      • Deployed the entire project's CI pipeline to embedded devices using GitLab and Docker, and developed unit tests for automated testing.
      • Built cross-architecture (x86/ARM) and cross-GPU compilation environments.
  • Skills: C++, CUDA, Assembly language, Git, ONNX, TensorRT.
Hsinchu, Taiwan
Apr. 2024 - Present

Synopsys

Application Engineer

  • Power Integrity Analysis
    • Analyzed IR-drop and power delivery network using in-design tools.
    • Collaborated with Ansys on integrating RedHawk/RedHawk-SC with internal flows.
  • Tool Development
    • Developed GUI for internal debugging with Qt.
    • Automated workflows and toolchains using Python.
  • Skills: Python, Qt, TCL, ICC2/Fusion Compiler, APR flow
Zhubei, Taiwan
Oct. 2022 - Mar. 2024

X-Epic

EDA Software Engineer

  • Compiler Front-End Development
    • Maintained and enhanced the SystemVerilog database for the front-end simulator.
    • Utilized design patterns for modular and maintainable architecture.
  • Testing
    • Established and maintained a local unit-testing framework.
  • Skills: C++, Git, SystemVerilog, Design Patterns
Zhubei, Taiwan
Dec. 2021 - Jul. 2022

MediaTek Inc.

Algorithm Engineer

  • RF Algorithm Development
    • Implemented advanced RF algorithms for 5G modem architecture (sub-6 / mmWave).
    • Conducted trade-off analysis for different system platforms.
    • Provided technical support for clients such as Xiaomi, OPPO, and Samsung.
  • Tool Automation & GUI
    • Developed internal tool GUIs to aid algorithm simulation and debugging.
    • Automated testing and workflows using Python scripts.
  • Skills: MATLAB, C, Python
Hsinchu, Taiwan
Mar. 2020 - Dec. 2021

EDUCATION.

National Yang Ming Chiao Tung University

MS in Electronics Engineering

Communication Electronics and Signal Processing Laboratory (CommLab)

  • Courses:
    • Computer Science: Operating Systems, Computer Architecture, Parallel Programming, Assembly Language, IC Lab
    • Digital Signal Processing: Advanced Digital Signal Processing, Digital Image Processing, Detection and Estimation Theory
    • Algorithms & AI: Advanced Algorithms, Machine Learning, Deep Learning
Hsinchu, TW
Jul. 2017 – Sep. 2019

Shanghai Jiao Tong University

Exchange Program

  • Courses: Machine Learning
Shanghai, CN
Feb. 2017 – Jul. 2017

National Sun Yat-Sen University

BS in Electrical Engineering

  • Graduated early (GPA: 3.85/4.0).
  • Courses: HDL, VLSI, Probability and Statistics, Digital Signal Processing
Kaohsiung, TW
Sep. 2013 – Feb. 2017

SKILLS.

Programming Languages

  • C++
  • Python
  • CUDA
  • MATLAB

Frameworks & Libraries

  • CUDA: cuBLAS, cuSolver, CUTLASS, cuFFT, cuFFTDx
  • Deep Learning: TensorFlow, PyTorch, Keras, ONNX, TensorRT
  • GUI: Qt6

Tools & Platforms

  • Git, Docker, CMake

Selected Online Courses

  • Stanford CS336 - Language Modeling from Scratch, Tatsunori Hashimoto, Percy Liang
  • NTHU OCW - Parallel Programming (平行程式), 周志遠
  • Assembly Language (彙編語言), 賀利堅, 王爽

CONTACT.