
Publication:

Handwritten Chinese Error Correction for Learners of Chinese as a Foreign Language

datacite.rights: restricted
dc.contributor.advisor: Fong, Ruth Catherine
dc.contributor.author: Chan, Emilio
dc.date.accessioned: 2025-08-06T15:48:41Z
dc.date.available: 2025-08-06T15:48:41Z
dc.date.issued: 2025-04-10
dc.description.abstract: Handwritten Chinese character error correction (HCCEC) is the process by which machine-learning models assess an image of a handwritten Chinese character, determine whether it is written incorrectly, and, if it is written incorrectly, output the character that the writer intended to write. HCCEC has gained more attention in recent years, but so far no work has been done to assess or create models targeted towards learners of Chinese as a foreign language (CFL learners). CFL learners stand to gain a great deal from HCCEC: an effective HCCEC model would be a valuable educational tool to help them learn and practice handwriting Chinese characters. As part of this work, a dataset of handwritten Chinese characters produced by CFL learners was created, containing both correctly and incorrectly written characters. Next, an existing HCCEC model called the Tree-structure Analysis Network (TAN) is trained on a large dataset containing characters written by middle school students in China and then evaluated on test sets of the CFL learner dataset (Li et al. 2023, Li et al. 2023). Finally, TAN is fine-tuned using the training and validation sets of the CFL learner dataset and re-evaluated on the test sets. While performance on key evaluation metrics does not reach that of previous work on different datasets, this work does show that fine-tuning HCCEC models using data produced by CFL learners can improve all key metrics when evaluating the model on characters written by CFL learners (Hu et al. 2023, Li et al. 2023). It is my hope that this work can be the first of many exploring the potential of HCCEC applied to characters written by learners of Chinese as a foreign language.
dc.identifier.uri: https://theses-dissertations.princeton.edu/handle/88435/dsp019k41zh96w
dc.language.iso: en_US
dc.title: Handwritten Chinese Error Correction for Learners of Chinese as a Foreign Language
dc.type: Princeton University Senior Theses
dspace.entity.type: Publication
dspace.workflow.startDateTime: 2025-04-15T19:47:51.450Z
pu.contributor.authorid: 920277968
pu.date.classyear: 2025
pu.department: Computer Science

Files

Original bundle

Name: written_final_report.pdf
Size: 8.68 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 100 B
Format: Item-specific license agreed to upon submission