Bio
I’m Professor of English at Université Paris Cité. My background is in corpus linguistics (text and speech), and I have a growing interest in AI, data science, machine learning and NLP.
Research Areas
Learner Corpus Research
Audio Large Language Models
Neural Machine Translation (gender bias, XAI)
Digital Humanities
Epistemology of linguistics (3rd revolution of grammatisation)
Recent projects
# Deep Learning for Language Assessment
# Spectrans (specialised translation)
# Neuroviz
# PAPTAN (platform for AI)
Recent events
The sound patterns of Whisper: an informal workshop on audio LLM responses to speech stimuli
Deep Learning for Language Assessment closing event
neuroViz and spectrans for Neural Machine translation
Recent publications
# Accepted
Ballier, N. (accepted) Exploring learner knowledge with Large Language Models fine-tuned with the EFCAMDAT, paper accepted for the Learner Corpus Research conference, LCR2024, Tartu, 26-28 Sept. 2024
LREC2024 (.pdf)
# communications
Tori Thurston (Fullerton) & Nicolas Ballier (2024) Using Whisper to investigate learner pronunciations of English: Comparing LLM transcriptions with human perception of VOT, 29 March, ALOES conference, Villetaneuse
https://eclla.univ-st-etienne.fr/fr/tout-l-agenda/annee-2023-2024/eclla-2024-21e-colloque-aloes.html
Ballier, N. & Yannakoudakis, H. (2022) Towards crowdsourcing research for learner keylogging data, LCR 2022, Padova, 22-24 Sept.
Chamoun, J. & Ballier, N. (2022) Automatic Analysis of Learner Essays based on Complexity Metrics using Machine Learning Algorithms, LCR 2022, Padova, 22-24 Sept.
Ballier, N. (2022) Faut-il former à ce que voit le réseau de neurones pour l’entraînement de la traduction ?, conference Enseigner la traduction et l’interprétation à l’heure neuronale, Université libre de Bruxelles, 28-29 September 2022
Namdarzadeh, B. & Ballier, N. (2022) What Does Neural Machine Translation Learn? A Snapshot from Google Translate & DeepL (2021-February 2022), conference Enseigner la traduction et l’interprétation à l’heure neuronale, Université libre de Bruxelles, 28-29 September 2022. https://tradital.ltc.ulb.be/medias/fichier/2022-colloque-tradital-programme-online_1660741236130-pdf
Ballier, N. (2022) Traduire les dislocations de l’oral avec la traduction neuronale, Le cas des dislocations à gauche dans le CFPP du Corpus de Français Parlé Parisien (CFPP) des années 2000, TROL conference (Traduire l’oralité à l’ère de l’IA), University of Turin, 5-6 December 2022
# Conference papers
Namdarzadeh, B. & Ballier, N. (2022a) The Neural Machine Translation of Dislocations, in Antonis Botinis (ed.), Proceedings of the 13th International Conference of Experimental Linguistics (EXLING), Université Paris Cité, 17-19 October 2022, 121-125.
Namdarzadeh, B., Ballier, N., Zhu, L., Wisniewski, G., and Yunès, J.-B. (2022b) Toward a Test Set of Dislocations in Persian for Neural Machine Translation, NSUR Proceedings, ACL
Wisniewski, G., Zhu, L., Yunès, J.-B. & Ballier, N. (2022) La robustesse de la traduction neuronale: les systèmes de traduction automatique neuronale à l’épreuve de la reproductibilité de l’expérience, Actes de la journée d’étude sur la robustesse des systèmes de TAL, with the support of ATALA and the STIH laboratory, Caio Corro and Gaël Lejeune (eds.), 25 November, ATALA, 29-32
https://www.atala.org/sites/default/files/robustal2022.pdf
Tighidet, Z. and Ballier, N. (2022) Fine-tuning a Subtle Parsing Distinction Using a Probabilistic Decision Tree: the Case of Postnominal “that” in Noun Complement Clauses vs. Relative Clauses, ALTA2022, ACL Anthology
Wisniewski, G., Zhu, L., Ballier, N. and Yvon, F. (2022) Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of an NMT System, BlackboxNLP2022, EMNLP2022.
https://preview.aclanthology.org/emnlp-22-ingestion/2022.blackboxnlp-1.13/
Ballier, N., Yunès, J.-B., Wisniewski, G., Zhu, L. and Zimina-Poirot, M. (2022) The SPECTRANS System Description for the WMT22 Biomedical Task, WMT22. https://aclanthology.org/2022.wmt-1.82/
Publications on the ACL Anthology
DBLP (computer science Digital Bibliography & Library Project)
Latest Open Access Publications on HAL (French Open Access Repository)
- [hal-04781585] Enhancing Translation Quality: A Comparative Study of Fine-Tuning and Prompt Engineering in Dialog-Oriented... (Lichao Zhu), 16 November 2024 at 16:43
For this shared task, we have used several machine translation engines to produce translations (en ⇔ fr) by fine-tuning a […]
- [hal-04781595] Enhancing Translation Quality: A Comparative Study of Fine-Tuning and Prompt Engineering in Dialog-Oriented... (Lichao Zhu), 13 November 2024 at 19:04
- [hal-04581509] Context-Aware Neural Machine Translation Models Analysis And Evaluation Through Attention (Marco Dinarelli), 10 October 2024 at 18:38
Model explainability has recently become an active research field. Many works are published supporting or criticizing attention weights as model […]
- [hal-04712737] Linguistic interoperability within a unified architecture (Thomas Gaillat), 27 September 2024 at 18:11
Modern approaches to quantitative linguistics rely on large datasets. These datasets are representations of linguistic observations made up of […]
> check other publications on HAL