Francisco Viveros-Jiménez

   
 

Overview

CICWSD is a Java API and command-line tool for word sense disambiguation (WSD). Its main features are:

  • Several state-of-the-art dictionary-based WSD algorithms are included and ready to use.
  • Many parameters are easy to configure: window size, number of senses retrieved from the dictionary, back-off method, tie-solving method, and conditions for retrieving window words.
  • All configuration lives in a single XML file.
  • Output is written to a simple XLS file using JExcelApi.
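
To make the window size, back-off and tie-solving parameters more concrete, here is a minimal sketch of a Simplified Lesk style disambiguator in plain Java. It is not the CICWSD API: the class name SimplifiedLeskSketch, the method disambiguate, the toy glosses and the "lowest sense index wins a tie" rule are all assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; this is NOT part of the CICWSD API.
final class SimplifiedLeskSketch {

    /**
     * Picks the sense whose gloss shares the most words with the context window.
     *
     * @param tokens       the tokenized sentence
     * @param target       index of the word to disambiguate
     * @param windowSize   number of tokens taken on each side of the target
     * @param glosses      one dictionary gloss per candidate sense
     * @param backOffSense sense index returned when no gloss overlaps the window
     * @return the chosen sense index; ties go to the lowest index (assumed rule)
     */
    static int disambiguate(List<String> tokens, int target, int windowSize,
                            List<String> glosses, int backOffSense) {
        // Collect the window words around the target, excluding the target itself.
        Set<String> window = new HashSet<>();
        int from = Math.max(0, target - windowSize);
        int to = Math.min(tokens.size() - 1, target + windowSize);
        for (int i = from; i <= to; i++) {
            if (i != target) {
                window.add(tokens.get(i).toLowerCase(Locale.ROOT));
            }
        }

        int best = backOffSense;
        int bestOverlap = 0;
        for (int s = 0; s < glosses.size(); s++) {
            Set<String> glossWords = new HashSet<>(
                    Arrays.asList(glosses.get(s).toLowerCase(Locale.ROOT).split("\\s+")));
            glossWords.retainAll(window); // keep only words shared with the window
            // Strict comparison: an equal overlap never replaces an earlier sense.
            if (glossWords.size() > bestOverlap) {
                bestOverlap = glossWords.size();
                best = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "he", "sat", "on", "the", "bank", "of", "the", "river");
        List<String> glosses = Arrays.asList(
                "a financial institution that accepts deposits",
                "sloping land beside a river or lake");
        // A window of 3 reaches "river", so the second gloss (index 1) is chosen.
        System.out.println(disambiguate(tokens, 4, 3, glosses, 0)); // prints 1
    }
}
```

With a window of 1 the word "river" falls outside the window, no gloss overlaps, and the sketch returns the back-off sense instead. This is why window size and the conditions for selecting window words can change the result.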

The API is licensed under the GNU General Public License (v2 or later), and the source code is included. The data sets for the Senseval 2 and Senseval 3 English All-Words tasks are bundled with CICWSD.

Please cite the following paper in your work:

  • Viveros-Jiménez, F., Gelbukh, A., Sidorov, G.: Improving Simplified Lesk Algorithm by using simple window selection practices. Submitted.

Download

The current version, 1.0, can be downloaded here.

A configuration tutorial for Version 1.0 can be downloaded here.

A data interpretation tutorial for Version 1.0 can be downloaded here.

A programming guide for Version 1.0 can be downloaded here.

Using the CICWSD command

First adjust the config.xml file; instructions are inside the file itself. Then type java -jar cicwsd.jar at the command line.
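
If you prefer to start the command from another Java program, the following sketch wraps the same invocation with the standard ProcessBuilder class. The class name CicwsdLauncherSketch and its methods are hypothetical, and the sketch assumes that cicwsd.jar and config.xml sit together in one working directory, which this page does not state.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not shipped with CICWSD.
final class CicwsdLauncherSketch {

    /** Builds the command line, refusing to continue when config.xml is missing. */
    static List<String> buildCommand(Path workDir) {
        // Assumption: config.xml is read from the working directory.
        if (!Files.isRegularFile(workDir.resolve("config.xml"))) {
            throw new IllegalStateException("config.xml not found in " + workDir);
        }
        return Arrays.asList("java", "-jar", "cicwsd.jar");
    }

    /** Runs the command in workDir and returns the process exit code. */
    static int run(Path workDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(workDir))
                .directory(workDir.toFile()) // run where the jar and config live
                .inheritIO()                 // show CICWSD output in this console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        Path workDir = Paths.get(args.length > 0 ? args[0] : ".");
        System.exit(run(workDir));
    }
}
```

Checking for config.xml before launching gives a clear error message early instead of relying on whatever the tool reports when the file is absent.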