Abstract
In this work, we address the problem of automatically assessing and constructing feedback for early-stage writing using machine learning. Early-stage writing typically differs vastly from conventional writing because of phonetic spelling and a lack of proper grammar, punctuation, spacing, etc. Consequently, early-stage writing is highly non-trivial to analyze using common linguistic metrics. We propose to use sequence-to-sequence models to translate early-stage writing by students into conventional writing, which allows the translated text to be analyzed using linguistic metrics. Furthermore, we propose a novel robust likelihood to mitigate the effect of label noise in the dataset. We investigate the proposed methods in a set of numerical experiments and demonstrate that the conventional text can be predicted with high accuracy.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 5th Northern Lights Deep Learning Conference |
| Volume | 233 |
| Publisher | Proceedings of Machine Learning Research |
| Publication date | 2024 |
| Pages | 104-112 |
| Publication status | Published - 2024 |
| Event | 5th Northern Lights Deep Learning Conference, Tromsø, Norway; Duration: 9 Jan 2024 → 11 Jan 2024; Conference number: 5 |
Conference
| Conference | 5th Northern Lights Deep Learning Conference |
|---|---|
| Number | 5 |
| Country/Territory | Norway |
| City | Tromsø |
| Period | 09/01/2024 → 11/01/2024 |