About me

I'm an Applied Scientist on the Microsoft Turing team.

Research Interests

  • Large Language Models

    • Parameter-Efficient Fine-tuning
    • Question Answering
    • Reinforcement Learning
  • Natural Language Processing

    • Multilingual Machine Translation
    • Efficient Token Representation
    • Privacy Preservation
  • Federated Learning

    • Private Federated Learning
  • Reinforcement Learning

    • Multi-agent Reinforcement Learning

My skills

Programming languages: Python, Java, C, MATLAB, CSS, HTML
Deep Learning: PyTorch, TensorFlow, NumPy, Pandas, Hugging Face
Tools: Git, Linux, LaTeX, AWS

Education

  1. August 2019 — May 2024

    • Ph.D. in Computer Science
    • Research areas: Large Language Models, Natural Language Processing, Federated Learning

  2. Xidian University
    September 2016 — June 2019

    • M.S. in Signal and Information Processing
    • Research areas: Bayesian Models, Deep Learning

  3. Xidian University
    August 2012 — June 2016

    • B.S. in Electronic Engineering (Education reform class)

Work

  1. Microsoft
    July 2024 — Present
    Applied Scientist
  2. Adobe
    May 2023 — December 2023
    Research Scientist/Engineer Intern
  3. Teaching Assistant

    • Course: CS560 Statistical Machine Learning (Graduate)
      January 2024 — May 2024
    • Course: CS541 Artificial Intelligence (Graduate)
      September 2023 — December 2023
    • Course: CS583 Deep Learning (Graduate)
      January 2023 — May 2023
    • Course: CS583 Deep Learning (Graduate)
      September 2022 — December 2022
    • Course: CS284D Data Structures (Undergraduate)
      January 2022 — May 2022

Projects

  1. Adobe
    May 2023 — December 2023
    Research Scientist/Engineer Intern

    • Project: Fine-tune and evaluate Large Language Models (LLMs) on domain-specific Question Answering (QA) data.
    • Main skills: Python/PyTorch/LLMs/Git/Question Answering/Pandas/Bash
    • Experiments: Fine-tune Falcon 7B and Falcon 40B models for QA; evaluate with industry-standard automatic metrics such as ROUGE and METEOR, and develop an LLM-based evaluation metric that assesses both semantic similarity and correctness (a minimal evaluation sketch follows this list).
    • Results: Demonstrate an improvement of approximately 0.1–0.2 across key automatic metrics including ROUGE-1, ROUGE-L, and METEOR, and an increase of about 1.0 (on a 1–5 scale) in both similarity and correctness as measured by the GPT-4 evaluator.
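
    To make the automatic-metric step concrete, here is a minimal, illustrative sketch using the Hugging Face evaluate library; the example predictions and references are placeholders, not project data.

      # Minimal sketch of the automatic QA evaluation step: score generated
      # answers against reference answers with ROUGE and METEOR.
      # The predictions/references below are illustrative placeholders.
      import evaluate

      predictions = [
          "Refunds can be requested within 30 days of purchase.",
          "Falcon is a decoder-only transformer language model.",
      ]
      references = [
          "A refund may be requested up to 30 days after purchase.",
          "Falcon models are decoder-only transformer LLMs.",
      ]

      rouge = evaluate.load("rouge")    # reports rouge1, rouge2, rougeL, rougeLsum
      meteor = evaluate.load("meteor")

      rouge_scores = rouge.compute(predictions=predictions, references=references)
      meteor_scores = meteor.compute(predictions=predictions, references=references)

      print({"rouge1": rouge_scores["rouge1"],
             "rougeL": rouge_scores["rougeL"],
             "meteor": meteor_scores["meteor"]})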

  2. September 2022 — December 2023
    Research Assistant

    • Project: Private NLP Model in Federated Learning
    • Task: Protect users' privacy against attacks that exploit embedding gradients by using byte-level token representations (see the sketch after this list).
    • Experiments: NLP tasks including machine translation, sentiment analysis, and language modeling.
    • Results: Gained approximately 1.0 BLEU point on translation and up to 1.3 points of accuracy on sentiment analysis over the baseline model. Defended against attacks that infer subwords from the gradients while maintaining model performance and efficiency.
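
    A minimal PyTorch sketch of the byte-level idea: text is tokenized as raw UTF-8 bytes, so the embedding table has only 256 rows and the non-zero rows of its gradient reveal far less about the underlying words than a large subword vocabulary would. The dimensions and the toy loss are assumptions for illustration, not the project's actual model.

      # Illustrative sketch: byte-level token embeddings for private federated NLP.
      # The 256-entry byte vocabulary and the toy loss are assumptions for
      # illustration; the real project model is a full NLP architecture.
      import torch
      import torch.nn as nn

      def byte_ids(text: str) -> torch.Tensor:
          """Encode text as UTF-8 byte IDs in [0, 255]."""
          return torch.tensor(list(text.encode("utf-8")), dtype=torch.long)

      vocab_size, d_model = 256, 32           # bytes vs. a ~50k subword vocabulary
      embedding = nn.Embedding(vocab_size, d_model)

      ids = byte_ids("private user message")
      loss = embedding(ids).pow(2).mean()     # toy loss, only to produce gradients
      loss.backward()

      # Non-zero gradient rows reveal only which bytes occurred,
      # which is far less identifying than which subwords occurred.
      nonzero = (embedding.weight.grad.abs().sum(dim=1) > 0).sum().item()
      print(f"non-zero embedding-gradient rows: {nonzero} / {vocab_size}")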

  3. December 2021 — August 2022
    Research Assistant

    • Project: Byte-based Multilingual Machine Translation
    • Task: Improve multilingual machine translation with byte-level tokenization (see the sketch after this list).
    • Experiments: NLP tasks including machine translation, sentiment analysis, and language modeling.
    • Results: Gained up to 18.5 BLEU points on low-resource and endangered languages. Enhanced the generalizability and robustness of the byte-based model with our proposed random byte encoding with ensemble prediction.
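
    The sketch below illustrates the random-byte-encoding-with-ensemble idea at inference time: input byte IDs are remapped by several random permutations of the 256-entry byte vocabulary and the model's outputs are averaged. The tiny pooling model and output size are assumptions for illustration, not the actual translation architecture.

      # Illustrative sketch of random byte encoding with ensemble prediction:
      # remap input byte IDs with K random permutations of the byte vocabulary
      # and average the model's predictions. The tiny model is a placeholder
      # for the real byte-based translation model.
      import torch
      import torch.nn as nn

      torch.manual_seed(0)
      vocab_size, d_model, num_outputs = 256, 32, 16

      class TinyByteModel(nn.Module):
          def __init__(self):
              super().__init__()
              self.embed = nn.Embedding(vocab_size, d_model)
              self.head = nn.Linear(d_model, num_outputs)

          def forward(self, ids):                            # ids: (seq_len,)
              return self.head(self.embed(ids).mean(dim=0))  # (num_outputs,)

      model = TinyByteModel()
      ids = torch.tensor(list("góðan daginn".encode("utf-8")), dtype=torch.long)

      K = 4                                                  # ensemble size
      perms = [torch.randperm(vocab_size) for _ in range(K)]
      logits = torch.stack([model(perm[ids]) for perm in perms]).mean(dim=0)
      print(logits.shape)                                    # averaged prediction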

  4. September 2019 — November 2021
    Research Assistant

    • Project: Privacy & Security in Federated Learning
    • Task: Propose a defense method for secure collaborative learning based on matrix sketching (see the sketch after this list).
    • Experiments: Classification on MNIST and CIFAR-10 under federated learning settings, and input recovery with and without our defense.
    • Results: Proved the effectiveness of our defense theoretically and experimentally, protecting user privacy without compromising model performance. Per-round communication cost is reduced to 0.5x, and the L2 norm of the gradient-matching loss increases from 0 to 25–100 with our defense, making attacks much more difficult.
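
    As an illustration of the communication savings, here is a minimal sketch that compresses a gradient matrix with a plain Gaussian random-projection sketch before sharing; the project's actual sketching scheme and its theoretical guarantees are not reproduced here.

      # Illustrative sketch of gradient sketching as a federated-learning defense:
      # the client projects its gradient matrix with a random sketching matrix,
      # halving the communicated payload and withholding the raw gradient that
      # gradient-matching attacks try to invert. A plain Gaussian sketch is used
      # here for illustration; the scheme in the project may differ.
      import torch

      torch.manual_seed(0)
      d_out, d_in = 512, 1024                 # shape of one layer's gradient (toy)
      k = d_out // 2                          # sketch size -> ~0.5x rows shared

      grad = torch.randn(d_out, d_in)         # stand-in for a real gradient
      S = torch.randn(k, d_out) / k ** 0.5    # random sketching matrix
      sketched = S @ grad                     # (k, d_in): what the client transmits

      print(grad.numel(), "->", sketched.numel())   # ~50% per-round communication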