Machine Learning Engineer

Otter.ai

About

I am Yi-Te Hsu, a machine learning engineer at Otter.ai, where I focus on using speech and NLP techniques to improve the automatic video/audio transcription and summarization system.

I have experience working on ML in both industry and academia. At Apple Inc., I worked on model efficiency projects to optimize a neural machine translation model. Before coming to the U.S., I conducted research on speech processing with Dr. Yu Tsao at Academia Sinica. I also collaborated with Prof. Frank Rudzicz at the University of Toronto (UofT) and the Vector Institute as a visiting researcher, working on detecting pathological voices and identifying Alzheimer’s disease.

I am excited about using software and machine learning skills to solve real-world problems!

Interests

  • Machine Learning
  • NLP and Speech Processing
  • Model Efficiency
  • Software Development

Education

  • M.S. in Computer Science, 2020

    Johns Hopkins University

  • Visiting Student Researcher, 2018

    University of Toronto

  • BSc in Electrical Engineering, 2017

    National Taiwan University

Experience

Machine Learning Engineer

Otter.ai

Apr 2021 – Present California
  • Applying speech and NLP techniques to improve the automatic video/audio transcription and summarization system.

Machine Learning Engineer Intern

Apple Inc.

Jun 2020 – Aug 2020 California
  • Surveyed and implemented state-of-the-art model efficiency techniques for deep neural networks.
  • Proposed a more efficient inference architecture by combining knowledge distillation, a simpler architecture, and pruning.
  • Achieved up to 109% speedup and reduced the number of parameters by 25% while maintaining the same translation quality.

Research Intern

Vector Institute; University of Toronto

Sep 2018 – Dec 2018 Toronto
  • Developed early pathological voice detection models using speech processing and deep learning techniques (MFCCs, filter banks, LSTM).
  • Built a robust system that addresses the channel mismatch problem between different devices through domain adversarial training, an unsupervised domain adaptation method, raising the target-domain PR-AUC from 0.84 to 0.94.
  • Proposed a transfer learning method to detect dementia in Mandarin by transferring feature domains from Mandarin to English.
  • Enabled multi-language application by combining algorithms and models from different languages.

Research Assistant

Academia Sinica

Feb 2018 – Jul 2019 Taipei
  • Proposed a quantized neural network (EOFP-QNN) that achieves a 4x compression rate by quantizing floating-point weights.
  • Developed IA-Net, which simultaneously compresses the model and accelerates the inference process by 1.2x.
  • Integrated and optimized deep learning-based models (LSTM, FCN …) for speech enhancement and various signal processing tasks.
  • Developed ML models and tools for disease detection and assistive speaking systems in collaboration with hospital doctors.

Data Scientist Intern

Mobagel

Jun 2016 – Feb 2017 Taipei
  • Applied ML techniques and statistical models to extract core information from different types of IoT data.
  • Predicted the office space occupancy rate using data from real-time sensors.
  • Deployed machine learning models (random forest, SVM, logistic regression …) to products.
  • Utilized clustering and data visualization techniques to detect anomalous samples.

Recent Publications

Efficient Inference For Neural Machine Translation. Accepted to the SustaiNLP Workshop at EMNLP 2020, 2020.

IA-NET: Acceleration and Compression of Speech Enhancement using Integer-adder Deep Neural Network. Accepted to the Annual Conference of the International Speech Communication Association (INTERSPEECH 2019), 2019.

Detecting dementia in Mandarin Chinese using transfer learning from a parallel corpus. Accepted to the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019), 2018.

Robustness against the channel effect in pathological voice detection. Accepted to the Machine Learning for Health Workshop at NIPS 2018, 2018.

A study on speech enhancement using exponent-only floating point quantized neural network (EOFP-QNN). Accepted to the IEEE Spoken Language Technology conference (IEEE SLT), 2018.

Projects

COVID-19 News Website

A COVID-19 news website containing the latest and the most important local information.

Yelp-Dataset-Analysis

An analysis of the Yelp dataset that provides insights for businesses.

Facebook Likes Estimator

Facebook Likes Estimator for Major News Publishers’ Pages.

How Social Media Influence Human Emotion

An analysis of how social media influence human emotion.

MovieWatson

An intelligent movie recommendation system.

Speaker Identification

Using FFT and signal processing techniques to identify speakers.