Topics
We welcome paper proposals on all aspects of LLM evaluation.
This includes research on:
- evaluation of foundation models, fine-tuned models, or complete systems (e.g., RAG), as well as their components (e.g., retrievers, chunkers, etc.)
- the creation or adaptation of benchmarks for French or other languages of interest, whether high-resource or low-resource, in general or specialized domains, or for noisy or non-standard language (e.g., social media, voice commands, etc.)
- evaluation on NLP tasks (translation, summarization, information extraction, etc.)
- the adaptation of existing evaluation methodologies to generative systems
- ethical dimensions, bias, privacy, cultural or legislative alignment
- performance dimensions: computation time, memory footprint, energy efficiency
- user-based evaluation, ergonomics, cognitive aspects
- evaluation of multimodal models (e.g., text-image, text-speech, etc.).
- ...
The proceedings of the workshop will be published.
Format and types of articles
Several types of articles are accepted:
- new contributions
- state-of-the-art surveys
- work in progress
- short or translated versions of articles accepted at a major conference
Article length is flexible: between 4 and 10 pages, excluding references.
As reviewing is double-blind, submissions must be anonymized. Papers must follow the TALN 2026 conference style sheet. Authors are invited to comply with the field's best practices: https://aclrollingreview.org/responsibleNLPresearch/
Articles should be submitted on this page (TBC).
Call schedule
- Submission deadline: TBC
- Notification to authors: TBC
- Final version: TBC