Haikang Deng

Research Assistant @ UNC NLP

2023-10-17T00:00:00+00:00

Benchmarked various Learning from Human Feedback methods and studied their overoptimization problem
Introduced an efficient weighted decoding method that aligns text to a given attribute using a unidirectional reward model
Explored language models’ knowledge-learning process and their QA performance relative to their pre-training data
Analyzed language model hallucination and tracked wrong answers in the training corpus

Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model

2023-10-10T00:00:00+00:00

While large language models have proven effective in a huge range of downstream applications, they often generate text that is problematic or lacks a desired attribute. In this paper, we introduce Reward-Augmented Decoding (RAD), a text generation procedure that uses a small unidirectional reward model to encourage a language model to generate text that has certain properties. Specifically, RAD uses the reward model to score generations as they are produced and rescales sampling probabilities to favor high-reward tokens. By using a unidirectional reward model, RAD can cache activations from prior generation steps to decrease computational overhead. Through experiments on generating non-toxic and sentiment-controlled text, we demonstrate that RAD performs best among methods that change only the generation procedure and matches the performance of state-of-the-art methods that involve re-training the language model. We further validate that RAD is effective on very large language models while incurring a minimal computational overhead.
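The rescaling step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `beta` weight, the top-k restriction, and the additive "logit plus weighted reward" form are assumptions made for illustration, and the reward scores are supplied as a plain dictionary rather than computed by a reward model.

```python
import math
import random

def rad_rescale(logits, rewards, beta=1.0, top_k=3):
    """Return {token: probability} after reward-based rescaling.

    logits  : {token: language-model logit} for the next position
    rewards : {token: reward score} for the text extended by that token
              (hypothetical stand-in for a reward model's output)
    beta    : assumed weight controlling how strongly rewards steer sampling
    """
    # Keep only the k highest-logit candidate tokens.
    candidates = sorted(logits, key=logits.get, reverse=True)[:top_k]
    # Shift each candidate's logit by its weighted reward.
    adjusted = {t: logits[t] + beta * rewards[t] for t in candidates}
    # Softmax over the adjusted logits (subtract the max for stability).
    peak = max(adjusted.values())
    exps = {t: math.exp(v - peak) for t, v in adjusted.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

def sample_token(probs, rng=random):
    """Draw one token from the rescaled distribution."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

if __name__ == "__main__":
    toy_logits = {"good": 1.0, "bad": 1.0, "meh": 0.0, "rare": -5.0}
    toy_rewards = {"good": 1.0, "bad": 0.0, "meh": 0.5, "rare": 1.0}
    probs = rad_rescale(toy_logits, toy_rewards, beta=2.0)
    print(probs, sample_token(probs))
```

With `beta=0` the candidates keep their original relative probabilities; larger values shift probability mass toward high-reward tokens.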

Large Language Models Struggle to Learn Long Tail Knowledge

2023-07-27T00:00:00+00:00

The Internet contains a wealth of knowledge – from the birthdays of historical figures to tutorials on how to code – all of which may be learned by language models. However, while certain pieces of information are ubiquitous on the web, others appear extremely rarely. In this paper, we study the relationship between the knowledge memorized by large language models and the information in pre-training datasets scraped from the web. In particular, we show that a language model’s ability to answer a fact-based question relates to how many documents associated with that question were seen during pre-training. We identify these relevant documents by entity linking pre-training datasets and counting documents that contain the same entities as a given question-answer pair. Our results demonstrate strong correlational and causal relationships between accuracy and relevant document count for numerous question answering datasets (e.g., TriviaQA), pre-training corpora (e.g., ROOTS), and model sizes (e.g., 176B parameters). Moreover, while larger models are better at learning long-tail knowledge, we estimate that today’s models must be scaled by many orders of magnitude to reach competitive QA performance on questions with little support in the pre-training data. Finally, we show that retrieval-augmentation can reduce the dependence on relevant pre-training information, presenting a promising approach for capturing the long-tail.
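The document-counting idea in the abstract can be sketched as follows. This is a toy illustration, not the paper's pipeline: it assumes entity linking has already been run, so each document is represented by a set of entity identifiers, and the function name and example identifiers are hypothetical.

```python
def count_relevant_docs(doc_entities, qa_entities):
    """Count documents whose linked entities include every entity
    of a question-answer pair.

    doc_entities : iterable of entity-ID collections, one per document
                   (assumed output of an entity linker)
    qa_entities  : entity IDs linked in the question and its answer
    """
    required = set(qa_entities)
    # A document is counted when all required entities appear in it.
    return sum(1 for entities in doc_entities if required <= set(entities))

if __name__ == "__main__":
    # Toy corpus: each set holds the entities linked in one document.
    corpus = [
        {"Dante_Alighieri", "Florence", "Divine_Comedy"},
        {"Florence", "Italy"},
        {"Dante_Alighieri", "Florence"},
    ]
    # Entities from a hypothetical question-answer pair.
    print(count_relevant_docs(corpus, ["Dante_Alighieri", "Florence"]))
```

Counts of this kind could then be compared against per-question accuracy to examine the relationship the abstract describes.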

Software Dev Engineer Intern @ Amazon

2022-08-01T00:00:00+00:00

Built a Horizonte Service for the Local Landing Page, which displays local products available for pickup
Deployed the service to production and verified its reliability with production data
Onboarded downstream dependencies to fetch data and extended JSP to render the user interface
Configured shopping portal page type and added routing rules from amazon.com

Software Engineer Intern @ Lenovo

2021-08-01T00:00:00+00:00

Trained an Encoder-Decoder LSTM for anomaly detection on time series data
Participated in the design of the Control Chart and Anomaly Detection Module
Performed model tuning and data grouping, which improved the F1 score from 0.41 to 0.48