Knowledge-Augmented Language Model Verification

Better RAG by self-verifying the process

#language-model #retrieval-augmentation

October 19, 2023

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Better RAG by self-reflecting on the process

#language-model #retrieval-augmentation

October 17, 2023

Video Language Planning

Vision-language models can make long-horizon task plans

#language-model #multi-modal #robotics

October 16, 2023

PaLI-3 Vision Language Models: Smaller, Faster, Stronger

Contrastive ViT Makes VLM Stronger

#language-model #multi-modal

October 13, 2023

Large Language Models Are Zero-Shot Time Series Forecasters

LLMs can zero-shot forecast the future

#language-model #forecasting

October 12, 2023

Empowering Psychotherapy with Large Language Models: Cognitive Distortion Detection through Diagnosis of Thought Prompting

A step towards an LLM psychotherapist

#language-model #psychiatry

October 11, 2023

Mistral 7B

Simple, intuitive tricks lift a 7B model to 13B-level performance

#language-model

July 13, 2023

Instruction Mining: High-Quality Instruction Data Selection for Large Language Models

Evaluating the quality of your instruction dataset

#language-model #instruction-tuning

July 10, 2023

Exploring the Potential of Large Language Models (LLMs) in Learning on Graphs

Pre-trained LLMs for graph-related tasks

#language-model #graph-neural-network

May 03, 2022

OPT: Open Pre-trained Transformer Language Models

Pre-trained large language models open to the public for responsible AI

#language-model #responsible-ai

January 05, 2022

SubMix: Practical Private Prediction for Large-scale Language Models

Making language models keep secrets by having partitioned ensemble models watch each other

#language-model #privacy-preserving

December 21, 2021

Efficient Large Scale Language Modeling with Mixture-of-Experts

Meta is working on efficient language models with MoE too

#language-model #scaling #mixture-of-experts

December 14, 2021

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

Scaling language models with less global warming

#language-model #scaling #mixture-of-experts

September 06, 2021

Finetuned Language Models Are Zero-Shot Learners

Training natural language models to learn with natural language

#language-model #zero-shot

July 08, 2021

Evaluating Large Language Models Trained on Code

GPT knows Python better than me now.

#code-generation #language-model
