I’m Professor of English at Université Paris Cité. My background is in corpus linguistics (text and speech), and I have a growing interest in AI, data science, machine learning and NLP.
Research Areas
Learner Corpus Research
Audio Large Language Models
Neural Machine Translation (gender bias, XAI)
Digital Humanities
Epistemology of linguistics (3rd revolution of grammatisation)
Recent projects
# Deep Learning for Language Assessment
# Spectrans (specialised translation)
# Neuroviz
# PAPTAN (platform for AI)
Recent events
The sound patterns of Whisper: an informal workshop on audio LLM responses to speech stimuli
Deep Learning for Language Assessment closing event
Neuroviz and Spectrans for Neural Machine Translation
Recent publications
# Accepted
Ballier, N. (accepted) Exploring learner knowledge with Large Language Models fine-tuned with the EFCAMDAT, paper accepted for the Learner Corpus Research conference, LCR2024, Tartu, 26-28 Sept. 2024
LREC2024 (.pdf)
# Communications
Tori Thurston (Fullerton) & Nicolas Ballier (2024) Using Whisper to investigate learner pronunciations of English: Comparing LLM transcriptions with human perception of VOT, 29 March, ALOES conference, Villetaneuse
Ballier, N. & Yannakoudakis, H. (2022) Towards crowdsourcing research for learner keylogging data, LCR 2022, Padova, 22-24 Sept.
Chamoun, J. & Ballier, N. (2022) Automatic Analysis of Learner Essays based on Complexity Metrics using Machine Learning Algorithms, LCR 2022, Padova, 22-24 Sept.
Ballier, N. (2022) Faut-il former à ce que voit le réseau de neurones pour l’entraînement de la traduction ?, conference at Université libre de Bruxelles, Enseigner la traduction et l’interprétation à l’heure neuronale, 28-29 September 2022
Namdarzadeh, B. & Ballier, N. (2022) What Does Neural Machine Translation Learn? A Snapshot from Google Translate & DeepL (2021-February 2022), conference at Université libre de Bruxelles, Enseigner la traduction et l’interprétation à l’heure neuronale, 28-29 September 2022. pdf
Ballier, N. (2022) Traduire les dislocations de l’oral avec la traduction neuronale, Le cas des dislocations à gauche dans le CFPP du Corpus de Français Parlé Parisien (CFPP) des années 2000, conference TROL – Traduire l’oralité à l’ère de l’IA, Université de Turin, 5-6 December 2022
# Conference papers
Namdarzadeh, B. & Ballier, N. (2022a) The Neural Machine Translation of Dislocations, Antonis Botinis (ed.) Proceedings of 13th International Conference of Experimental Linguistics (EXLING), Université Paris Cité, 17-19 October 2022, 121-125.
Namdarzadeh, B., Ballier, N., Zhu, L., Wisniewski, G., and Yunès, J.-B. (2022b) Toward a Test Set of Dislocations in Persian for Neural Machine Translation, NSUR Proceedings, ACL
Wisniewski, G., Zhu, L., Yunès, J.-B. & Ballier, N. (2022) La robustesse de la traduction neuronale: les systèmes de traduction automatique neuronale à l’épreuve de la reproductibilité de l’expérience, Actes de la journée d’étude sur la robustesse des systèmes de TAL, with the support of ATALA and the STIH laboratory, Caio Corro and Gaël Lejeune (eds.), 25 November, ATALA, 29-32
Tighidet, Z. and Ballier, N. (2022) Fine-tuning a Subtle Parsing Distinction Using a Probabilistic Decision Tree: the Case of Postnominal “that” in Noun Complement Clauses vs. Relative Clauses, ALTA2022, ACL anthology
Wisniewski, G., Zhu, L., Ballier, N. and Yvon, F. (2022) Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of an NMT System, BlackboxNLP2022, EMNLP2022.
Nicolas Ballier, Jean-Baptiste Yunès, Guillaume Wisniewski, Lichao Zhu, Maria Zimina-Poirot (2022) The SPECTRANS System Description for the WMT22 Biomedical Task, WMT22.
Publications on the ACL anthology
DBLP (Computer Science Digital Bibliography & Library Project):
Latest Open Access Publications on HAL (French Open Access Repository)
- [hal-04912112] Probing Whisper Predictions for French, English and Persian Transcriptions, by Nicolas Ballier, on 25 January 2025 at 18:36
Whisper is a widely-used open-access Large Language Model (LLM) trained using a multilingual paradigm. As such it represents an important opportunity […]
- [hal-04878158] Explainability for NMT (with) users?, by Nicolas Ballier, on 9 January 2025 at 21:25
This presentation will first present previous attempts at visualising neural machine translation and discuss previous visualisation systems based on […]
- [hal-04878135] Exploring learner knowledge with Large Language Models fine-tuned with the EFCAMDAT, by Nicolas Ballier, on 9 January 2025 at 20:59
This paper analyses the performance and heuristic potential of an artificial learner built from a retraining […]
- [hal-04866716] Le projet européen MultiTraiNMT : Un premier retour sur les usages et les besoins au sein des (Caroline Rossi) on 6 January 2025 at 12h57
> check other publications on HAL
You will be redirected to HAL, the French open access repository.