Topics
We welcome paper proposals on all aspects of LLM evaluation.
This includes research on:
- evaluation of foundation models, fine-tuned models, or complete systems (e.g., RAG), as well as their components (e.g., retrievers, chunkers, etc.)
- the creation or adaptation of benchmarks for French or other languages of interest, whether high-resource or low-resource, in general or specialized domains, or for noisy or non-standard language (e.g., social media, voice commands, etc.)
- evaluation on NLP tasks (translation, summarization, information extraction, etc.)
- the adaptation of existing evaluation methodologies to generative systems
- ethical dimensions, bias, privacy, cultural or legislative alignment
- performance dimensions: computation time, memory footprint, energy efficiency
- user-based evaluation, ergonomics, cognitive aspects
- evaluation of multimodal models (e.g., text-image, text-speech, etc.).
- ...
The proceedings of the workshop will be published.
Format and types of articles
Several types of articles are accepted:
- new contributions
- state-of-the-art surveys
- work in progress
- short or translated versions of articles accepted at a major conference
Article length is flexible: between 4 and 10 pages, excluding references.
As reviewing is double-blind, submissions must be anonymized. Papers must follow the TALN 2026 conference style sheet. Authors are invited to comply with the field's best practices: https://aclrollingreview.org/responsibleNLPresearch/
Articles should be submitted on this page (TBC).
Call schedule
- Submission deadline: TBC
- Notification to authors: TBC
- Final version: TBC