The UMIACS Word Alignment Interface
© Nitin Madnani & Rebecca Hwa, 2004
Introduction:
This is an improved version of the alignment interface that Rebecca Hwa wrote while she was here at UMIACS.
I liked the appearance of the interface a lot, but I wanted to incorporate some of the functionality that Blinker
(another alignment interface, written by Dan Melamed) provided. So, I rewrote some of the underlying code to get us
that functionality. Along the way, I also made some cosmetic changes to make the interface look and behave better
(e.g., dialog boxes).
Download:
1) Get the binary:
Download align.jar
OR
If you are on a clip-machine, copy over /fs/clip-duster/DUSTer/AlignUI/align.jar
2) Get the data:
The interface can be used with any data that you have, as long as it is in the proper encoding.
The Chinese data needs to be encoded as GB2312.
For testing purposes, you can use the following data files:
Download the following two files and put them in the same directory as the interface binary above:
English Data
Chinese Data
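The encoding requirement above can be checked before you launch the interface. The following is a minimal sketch, not part of align.jar: the class name Gb2312Check and its method are hypothetical, and it assumes your Java installation ships the GB2312 charset (standard JDKs do). It decodes the bytes strictly and reports whether they form valid GB2312.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper (not part of align.jar): checks that data decodes as GB2312.
public class Gb2312Check {

    // Returns true if the whole byte array decodes as GB2312 without errors.
    static boolean isValidGb2312(byte[] data) {
        try {
            Charset.forName("GB2312")
                   .newDecoder()
                   // Report problems instead of silently substituting characters.
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Gb2312Check <file>");
            System.exit(2);
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidGb2312(data)
                ? "OK: file decodes as GB2312"
                : "Problem: file is not valid GB2312");
    }
}
```

For example, after compiling with javac, running "java Gb2312Check test.ch" on the Chinese test file should report OK if the download was not re-encoded along the way. Note that a plain-ASCII file also passes, since ASCII is a subset of GB2312.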
Prepare your system:
-
Windows 2000:
IMPORTANT: You will need an administrator level account on your machine and the Windows 2000 CD to get this working.
1) Make sure you have the latest version of the Java SDK (which is presently version 1.4.2_05)
2) Go to Start -> Settings -> Control Panel.
3) Double click on "Regional Options".
4) Click on the General tab.
5) Under the heading "Language Setting for the System", make sure "Simplified Chinese" and
"Traditional Chinese" are checked. If not, select them.
6) Select the "Input Locales" tab, click on the Add button, and select "Chinese (PRC)" from the
Input Locales drop-down menu in the dialog box that comes up.
7) Click on Apply.
8) If you did not make any selections or changes in the steps above, skip to step 12;
otherwise, continue to step 9.
9) If you made changes to the configuration, you will be prompted for the Windows 2000 CD.
10) Insert the disc and follow the instructions. You will need to restart the computer.
Once restarted, continue following the instructions below.
11) Return to "Regional Options" in the Control Panel.
12) Select the General tab.
13) Select "Chinese (PRC)" from the drop-down menu called "Your locale (location)".
14) Click OK.
IMPORTANT: You MUST return to your original locale after using the interface; otherwise, you will
encounter problems during normal usage of your computer. To do this, repeat steps 11-14
but select your original locale [usually "English (United States)"] from the drop-down menu.
-
Windows XP:
IMPORTANT: You will need an administrator level account on your machine and the Windows XP CD to get this working.
1) Make sure you have the latest Java version installed (presently version 1.4.2_05)
2) Go to Start -> Control Panel
3) Double click "Regional & Language Options". (It may also be called
"Date, Time, Language & Regional Options" on your machine)
4) Click on the Languages tab
5) Under "Supplemental Language Support", make sure that all options are checked,
especially "Install files for East Asian Languages". If they are not, select them.
6) Click on the Advanced tab.
7) Under "Code Page Conversion Tables", make sure the following are selected:
10002 (MAC - Traditional Chinese Big5)
10008 (MAC - Simplified Chinese GB 2312)
20936 (Simplified Chinese GB2312)
54936 (GB18030 Simplified Chinese)
8) Click on Apply.
9) If you did not make any selections or changes in the above steps,
skip directly to step 13; otherwise, continue to step 10.
10) If you made changes and selections in the above steps, you will be prompted for
the Windows XP CD. Insert and follow the instructions that appear on the screen.
11) You may also have to restart the computer. Once restarted, continue from below.
12) Return to "Regional & Language Options" in the Control Panel
13) Click on the Advanced tab.
14) Under "Language for non-Unicode programs", select "Chinese (PRC)" from the drop down menu.
15) Click OK.
16) You will be prompted to restart your computer.
17) Once restarted, start a command prompt (Start -> Programs -> Accessories -> Command Prompt).
18) Change directory to the interface directory.
19) Run the interface as follows:
java -jar align.jar test.eng test.ch GB2312
Run:
-
Windows XP/2000:
1) Open a command prompt. (Start -> Programs -> Accessories -> Command Prompt)
2) Change to the interface directory. Run the interface as follows:
java -jar align.jar test.eng test.ch GB2312
-
Mac OS X:
Just open a terminal, 'cd' to the interface directory, and run:
java -jar align.jar test.eng test.ch GB2312
It just works!
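Before running the command shown above, it can help to rule out the most common setup problems. The following is a rough sketch, not part of align.jar: the class name AlignPreflight is hypothetical, and it assumes that the third command-line argument is an encoding name that Java resolves through its standard charset support. It checks that both data files are readable and that the JVM knows the requested encoding.

```java
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical pre-flight check (not part of align.jar).
public class AlignPreflight {

    // Returns a description of the first problem found, or null if none.
    static String check(String engFile, String chFile, String encoding) {
        if (!Files.isReadable(Paths.get(engFile))) {
            return "cannot read " + engFile;
        }
        if (!Files.isReadable(Paths.get(chFile))) {
            return "cannot read " + chFile;
        }
        boolean supported;
        try {
            supported = Charset.isSupported(encoding);
        } catch (IllegalArgumentException e) {
            // Thrown when the name is null or not a legal charset name.
            supported = false;
        }
        if (!supported) {
            return "this Java installation does not support encoding " + encoding;
        }
        return null;
    }

    public static void main(String[] args) {
        if (args.length != 3) {
            System.err.println("usage: java AlignPreflight <english-file> <chinese-file> <encoding>");
            System.exit(2);
        }
        String problem = check(args[0], args[1], args[2]);
        System.out.println(problem == null ? "Looks fine." : "Problem: " + problem);
    }
}
```

It takes the same arguments as the interface itself, for example: java AlignPreflight test.eng test.ch GB2312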
Usage:
-
Create a link:
1) Click on one of the words that you want to link (either Chinese or English)
2) Click on the word in the other language.
3) The link is created automatically.
The following animated gif shows a demo on how to create a link.
Creating a link
-
Delete a link:
1) Click on any word in the link.
2) Click on the other word.
3) The link should now be red, indicating that it is selected.
4) Click on the "Delete Link" button to delete the link.
The following image shows how to delete a link:
Deleting a link
De-selecting a selected link
1) Once you have selected a link as outlined above in "Delete a link", you can de-select it if you change your
mind about deleting that link. To do this, just click on the "De-select Link" button.
The following image demonstrates this:
De-selecting a selected link
Important Notes:
-
If you are not finished with a sentence but would still like to proceed with the next sentence, click the "Next Sentence" button. You will see a warning:
Warning when skipping an incompletely aligned sentence
Click on "Yes" to proceed to the next sentence. You can return to any earlier sentence at any time by using the "Prev Sentence" button.
-
When you skip a sentence that is not completely aligned, the interface records its number on disk. This is done to remind you to finish aligning those sentences. Reminders will be seen when you exit the interface:
Warning about sentences left unaligned when exiting the interface
You can see which sentences you left unfinished by clicking the "Which Ones?" button:
The sentences that were left unfinished
A similar warning will be seen when you start the interface after a session in which you had unfinished sentences:
Warning about sentences that were left unaligned, when starting the interface
Help/Feedback:
If you have any questions, technical difficulties, or comments, feel free to bug me at nmadnani at umiacs dot umd dot edu.
Thanks:
In no particular order:
- Bonnie Dorr
- Christof Monz
- Necip Fazil Ayan
- Nizar Habash
- Okan Kolak
- and, of course, Rebecca Hwa !!