CLoAn stands for Contrastive Loanword Annotator. It is an annotation tool to create contrastive bitext, with one side containing loanwords and the other only "native" alternatives. It is part of my Bachelor's Thesis.
To use CLoAn, clone this repository in a directory of your choice:
git clone git@github.com:chamisshe/CLoAn.git
Next, you'll want to run the included setup-scripts. This will create a virtual environment, install the required packages and create the necessary folder structure.
Mac/Linux
On a UNIX-based system, run the following commands:
cd CLoAn
source setup.sh
Windows
On Windows (CMD or PowerShell), run:
cd CLoAn
setup.bat
You will most likely work with the devtest split of the FLORES+ dataset. We cannot host the dataset on a public repository, so you will have to download it from its source repository yourself.
Important: Keep track of the path where you stored it, as you'll need to tell CLoAn where it is when you use the tool for the first time.
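Before starting CLoAn for the first time, it can help to confirm that the dataset directory exists, is not empty, and to note its absolute path. The following is only a minimal sketch of such a check: `check_dataset_dir` is a hypothetical helper and not part of CLoAn, and it makes no assumption about how the FLORES+ files are laid out inside the directory.

```shell
# Hypothetical helper (not part of CLoAn): verify that a dataset directory
# exists and is non-empty, then print its absolute path.
check_dataset_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ -z "$(ls -A "$dir")" ]; then
        echo "empty: $dir"
        return 1
    fi
    # Resolve to an absolute path so it can be pasted into CLoAn later.
    (cd "$dir" && pwd)
}
```

For example, `check_dataset_dir path/to/your/download` (the path here is a placeholder for wherever you saved the devtest split) prints the absolute path on success, or a short `missing:` / `empty:` message otherwise.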
CLoAn is by no means bug-free (hardly any software is). Bugs encountered during development have mostly been accounted for and handled, but that most likely doesn't cover every possible edge case. If you run into bugs or inconveniences, or have suggestions for improvement, you can either open an Issue or write me an email.
