A one-day workshop at Université de Paris, 30 Oct 2019
organised by Manon Bouyé and Nicolas Ballier
8 Place Paul Ricœur, 75013 Paris
Olympe de Gouges Building, room 115, first floor
Funding
With the financial support of the French Ministry for Europe and Foreign Affairs (Ministère de l’Europe et des Affaires étrangères, MEAE) and the French Ministry of Higher Education, Research and Innovation (Ministère de l’Enseignement supérieur, de la Recherche et de l’Innovation, MESRI).
Program
MORNING: discussing our results
9h00 Opening. Nicolas Ballier: The Ulysse PHC project: aims, data and limitations
9h20 Thomas Gaillat: Investigating learner micro-systems and customizing CEFR criterial features: the micro-system feature set and its regex syntax
9h40 Discussion
10h30 Bernardo Stearns and Annanda Sousa: The user interface prototype demo
We hope to deliver Docker and GitHub versions of our user interface, which lets you paste a text, have a coffee while the text is processed, and then get the probability that the text belongs to a given CEFR level.
10h45 Discussion
11h15 Bernardo Stearns for Andrew Simpkins: Classifying learner level
Overfitting? Comparison with a graded corpus
As a preliminary step, we have tested our current user interface with the CEFR ASAG corpus to check whether our model is biased towards the A1 level.
11h30 General discussion
12h15 LUNCH BREAK (poster session at Diderot)
Posters are displayed at Diderot and on a shared Google Drive for remote participants.
Thomas Gaillat et al. (Rennes): Visualisations of linguistic profiles in learner written productions
Elena Volodina (Gothenburg): Overview over text-based CEFR research for L2 Swedish: on the intersection between NLP, L2 corpora and CALL
Arnold et al. poster (paper presented at the CAp2018 conference). A paper adding syntactic complexity metrics to the CAp2018 dataset was also accepted for this French machine learning conference:
Arnold, T., Ballier, N., Gaillat, T. & Lissón, P. (2018). Predicting CEFRL levels in learners of English on the basis of metrics and full texts. CAp2018 conference, Université de Rouen, 19-21 June 2018. Paper 31 in the proceedings of the conference. arXiv
AFTERNOON: Learner corpora and beyond: collecting and interpreting learning process and product data
A blueprint was circulated pointing out potential future directions.
14h STRAND 1 Adding more metrics/NLP-based methods for error detection / problematic areas for learners
15h STRAND 2 Exploring the relation between Learner corpus annotation, language testing, and individual feedback to learners
16h30 Coffee break
17h STRAND 3 Should we try to link learner corpus and learning analytics research, and what is there to be gained? Ideas for tracking development path? (Fuchs, Götz & Werner 2016) How to develop learner profiles based on student input?
18h15 Closing remarks and future plans
18h30 End of the workshop
Call for participation
As a closing event of a European-funded project, we invite colleagues to share their ideas about the automatic analysis of learner corpora and how it can be applied to interlanguage analysis, CEFR level prediction, and error detection, and extended to support individual feedback to learners and learning analytics.
The morning session will present some of the results of this French-Irish project “PHC Ulysse 2019”: the features of the EFCAMDAT corpus we used as the first step for our experiments, the methodology we developed, and our main findings. We will present our user interface prototype for the automatic detection of CEFR levels and discuss aspects such as overfitting of a model based on the French and Spanish components of EFCAMDAT. We will also discuss the shared task we held on a portion of this
Over coffee breaks, we will discuss posters recapitulating some of the issues.
Admission is free but registration is compulsory (on a first come, first served basis) on Framapad.
The summary of the Ulysse PHC Project can be found here.
Discussants
Discussants at Diderot
Taylor Arnold (University of Richmond) is Assistant Professor of Statistics and has a strong interest in NLP as a data scientist and digital humanist; see arXiv.
Detmar Meurers (University of Tübingen) is Professor of Computational Linguistics and head of the research group on Intelligent Computer-Assisted Language Learning there.
Discussants (videoconference)
Mick O’Donnell (Universidad Autónoma de Madrid, Departamento de Filología Española). See the WricLE corpus, the TREACLE Project and the Adaptive Learning of English Grammar Online.
Elena Volodina (Gothenburg). See the SweLL project – research infrastructure for Swedish as a second language.
Olga Vinogradova (Moscow, National Research University Higher School of Economics). See the Realec project, Russian Error-Annotated Learner English Corpus.
See the 59 features: the link and a short description are attached.
Contact person