2024
The following papers are accepted to NeurIPS 2024:
- How Do Large Language Models Acquire Factual Knowledge During Pretraining?
- Aligning to Thousands of Preferences via System Message Generalization
The following papers are accepted to EMNLP 2024:
- Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
- Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Utilization
- Exploring the Practicality of Generative Retrieval on Dynamic Corpora
- On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning
- Rethinking the Role of Proxy Rewards in Language Model Alignment
- Instruction Matters: A Simple yet Effective Task Selection Approach in Instruction Tuning for Specific Tasks
The following paper is accepted to EMNLP 2024 Findings:
- Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards
The following papers are accepted to ACL 2024:
- Semiparametric Token-Sequence Co-Supervision
- LangBridge: Multilingual Reasoning Without Multilingual Supervision
- Aligning Large Language Models by On-Policy Self-Judgment
- Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?
- ListT5: Listwise Reranking with Fusion-in-Decoder Improves Zero-shot Retrieval
The following paper is accepted to ACL 2024 Findings:
- Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation
The following papers are accepted to NAACL 2024:
- REPLUG: Retrieval-Augmented Black-Box Language Models
- Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision
- KTRL+F: Knowledge-Augmented In-Document Search
- How Well Do Large Language Models Truly Ground?
- Carpe diem: On the Evaluation of World Knowledge in Lifelong Language Models
We are welcoming Jinho Park (MS), Juyoung Suk (MS), Hyeonbin Hwang (MS), and Seongyun Lee (MS). We are also welcoming the MS → PhD conversion of Hoyeon Chang.
Hanseok Oh (MS) has graduated.
Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis by Sohee Yang et al. is accepted to TACL 2024. [code]
The following papers are accepted to ICLR 2024:
- (Spotlight) FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
- SuRe: Improving Open-domain Question Answering of LLMs via Summarized Retrieval
2023
Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following by Seonghyeon Ye et al. is accepted to AAAI 2024. [code]
Glad to share that Seonghyeon Ye has received the Qualcomm Innovation Fellowship Korea 2023 Award! [link]
The following papers are accepted to EMNLP 2023:
- The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
- Aligning Large Language Models through Synthetic Feedback
- Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Models
- Cream: Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models
The following paper is accepted to EMNLP 2023 Findings:
- Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt
The following paper is accepted to the EMNLP 2023 Industry Track:
- An Integrated Search System for Korea Weather Data
A Bayesian Perspective On Training Data Attribution by Elisa Nguyen et al. is accepted to NeurIPS 2023. [code]
We are welcoming Geewook Kim (PhD), Dongkeun Yoon (MS+PhD), and Suehyun Park (MS).
Joel Jang (MS), Soyoung Yoon (MS), Yongrae Jo (MS), and Eunbi Choi (MS) have graduated.
The following papers are accepted to ACL 2023:
- Knowledge Unlearning for Mitigating Privacy Risks in Language Models
- Towards Standardizing Korean Grammatical Error Correction: Datasets and Annotation
- Gradient Ascent Post-training Enhances Language Model Generalization
The following papers are accepted to ACL 2023 Findings:
- Nonparametric Decoding for Generative Retrieval
- Fixed Input Parameterization for Efficient Prompting
- Two Examples are Better than One: Context Regularization for Gradient-based Prompt Tuning
- Comparing and Contrasting Claims on Contentious Issues
Exploring the Benefits of Training Expert Language Models over Instruction Tuning by Joel Jang et al. is accepted to ICML 2023. [code]
We are welcoming Doyoung Kim (MS), Seungone Kim (MS), and Jiyeon Kim (MS). We are also welcoming the MS → MS+PhD conversion of Seonghyeon Ye and the MS → PhD conversion of Hyunji Lee.
Sohee Yang is joining UCL as a PhD student and DeepMind as a Research Scientist Intern.
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners by Seonghyeon Ye et al. is accepted to ICLR 2023. [code] [demo]
2022
Glad to share that Joel Jang has received the Qualcomm Innovation Fellowship Korea 2022 Award! [link]
Minjoon will give a talk at Samsung AI Forum 2022 on the topic of Generative Retrieval. [news]
The following paper is accepted to the NeurIPS 2022 Workshop on Transfer Learning for NLP:
- Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts
The following papers are accepted to EMNLP 2022:
- TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models
- Generative Multi-hop Retrieval
- Saving Dense Retriever from Shortcut Dependency in Conversational Search
- Generating Information-Seeking Conversations from Unlabeled Documents
The following paper is accepted to EMNLP 2022 Findings:
- Keep Me Updated! Memory Management in Long-term Conversations
The following papers are accepted to the NeurIPS 2022 Datasets and Benchmarks Track:
- A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction
- EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records
Please see here for instructions. The deadline is 2022.09.12.
We are teaching a new AI+X course, “AI for Law”, in the Fall 2022 semester. It is featured in KAIST News. [link]
We are welcoming Hoyeon Chang (MS+PhD), Sungdong Kim (MS+PhD), and Hyowon Cho (MS).
We are welcoming Haebin Shin (MS, Samsung Research) and Seonghyeon Ye (MS). We are also welcoming the MS → MS+PhD conversion of Joel Jang.
Towards Continual Knowledge Learning of Language Models by Joel Jang et al. is accepted to ICLR 2022.
2021
KAIST LK Lab (Hanseok Oh and Minjoon Seo) and Twelve Labs (Aiden Lee) have won the VALUE Challenge Retrieval Track at ICCV 2021. The results are published at the ICCV 2021 CLVL Workshop: ViSeRet: A simple yet effective approach to moment retrieval via fine-grained video segmentation.
We are welcoming Hanseok Oh (MS) and Yongrae Jo (MS)!
Label Embedding for Chinese Grapheme-to-Phoneme Conversion by Eunbi Choi et al. is accepted to Interspeech 2021.
The following papers are accepted to ACL 2021 Findings:
- Spatial Dependency Parsing for Semi-Structured Document Information Extraction by Hwang et al. (including Sohee Yang and Minjoon Seo)
- SSMix: Saliency-based Span Mixup for Text Classification by Soyoung Yoon et al.
Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering by Sohee Yang and Minjoon Seo is accepted to NAACL 2021 as a short paper.
We are welcoming seven starting members of the lab: Sohee Yang (MS+PhD), Miyoung Ko (PhD), Soyoung Yoon (MS), Joel Jang (MS), Jinkyung Jo (PhD), Eunbi Choi (MS), and Hyunji Lee (MS).