cic.wsd.semcor
Class Input

java.lang.Object
  extended by cic.wsd.semcor.Input

public class Input
extends java.lang.Object

Class for loading a SEMCOR XML file.

Author:
Francisco Viveros-Jiménez

Field Summary
(package private)  java.util.ArrayList<AmbiguousWord> ambiguousWords
          An ArrayList with the ambiguous words (words found in WordNet) of the SEMCOR document.
(package private)  java.util.HashMap<java.lang.String,java.util.ArrayList<java.lang.Integer>> Index
          A simple index recording the positions (appearances) of each lemma in the document.
(package private)  java.lang.String name
          This file's name.
 
Constructor Summary
Input(java.io.File file, java.util.ArrayList<Pruning> pruningList)
          Parses a SEMCOR XML file.
 
Method Summary
private  void calculateTF()
          Calculates the TF value for all the ambiguous words in this document.
 java.util.ArrayList<AmbiguousWord> getAmbiguousWords()
          Returns an ArrayList with all the ambiguous words (open-class words found in WordNet) of this document.
 java.util.HashMap<java.lang.String,java.util.ArrayList<java.lang.Integer>> getIndex()
          Returns a HashMap recording the positions (appearances) of each lemma in the document.
private  void Indexing()
          Creates the index of lemma appearances.
 java.lang.String toString()
          Returns the name of this file.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

ambiguousWords

java.util.ArrayList<AmbiguousWord> ambiguousWords
An ArrayList with the ambiguous words (words found in WordNet) of the SEMCOR document.


Index

java.util.HashMap<java.lang.String,java.util.ArrayList<java.lang.Integer>> Index
A simple index recording the positions (appearances) of each lemma in the document.


name

java.lang.String name
This file's name.

Constructor Detail

Input

public Input(java.io.File file,
             java.util.ArrayList<Pruning> pruningList)
      throws java.lang.Exception
Parses a SEMCOR XML file.

Parameters:
file - The file to be parsed.
Throws:
java.lang.Exception

Method Detail

getAmbiguousWords

public java.util.ArrayList<AmbiguousWord> getAmbiguousWords()
Returns an ArrayList with all the ambiguous words (open-class words found in WordNet) of this document.

Returns:
ambiguousWords

getIndex

public java.util.HashMap<java.lang.String,java.util.ArrayList<java.lang.Integer>> getIndex()
Returns a HashMap recording the positions (appearances) of each lemma in the document.

Returns:
Index
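
The shape of the returned map (lemma to list of positions) can be illustrated with a minimal sketch. This is not the class's actual implementation: the class name LemmaIndexSketch, the method buildIndex, and the use of 0-based positions over a plain list of lemma strings are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

// Hypothetical helper, not part of cic.wsd.semcor.
class LemmaIndexSketch {

    // Maps each lemma to the 0-based positions at which it occurs
    // (0-based positions are an assumption of this sketch).
    static HashMap<String, ArrayList<Integer>> buildIndex(List<String> lemmas) {
        HashMap<String, ArrayList<Integer>> index = new HashMap<>();
        for (int i = 0; i < lemmas.size(); i++) {
            // Create the position list on first sight of a lemma, then append.
            index.computeIfAbsent(lemmas.get(i), k -> new ArrayList<>()).add(i);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> lemmas = Arrays.asList("bank", "river", "bank", "money");
        HashMap<String, ArrayList<Integer>> index = buildIndex(lemmas);
        System.out.println(index.get("bank")); // prints [0, 2]
    }
}
```

With a map of this shape, the number of appearances of a lemma is simply the size of its position list.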

calculateTF

private void calculateTF()
Calculates the TF value for all the ambiguous words in this document.
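
The documentation does not state which TF formula is used. As a rough sketch only, the following assumes the common definition of term frequency as the number of occurrences of a lemma divided by the total number of lemmas in the document. The class name TermFrequencySketch and the method termFrequencies are hypothetical and do not belong to this package.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of cic.wsd.semcor.
class TermFrequencySketch {

    // Assumed definition: TF(lemma) = occurrences of lemma / total lemmas.
    static Map<String, Double> termFrequencies(List<String> lemmas) {
        // First pass: count occurrences of each lemma.
        Map<String, Integer> counts = new HashMap<>();
        for (String lemma : lemmas) {
            counts.merge(lemma, 1, Integer::sum);
        }
        // Second pass: normalize by document length.
        Map<String, Double> tf = new HashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            tf.put(entry.getKey(), entry.getValue() / (double) lemmas.size());
        }
        return tf;
    }

    public static void main(String[] args) {
        List<String> lemmas = Arrays.asList("bank", "river", "bank", "money");
        System.out.println(termFrequencies(lemmas).get("bank")); // prints 0.5
    }
}
```

An empty document yields an empty map, since the normalization loop never runs.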


Indexing

private void Indexing()
Creates the index of lemma appearances.


toString

public java.lang.String toString()
Returns the name of this file.

Overrides:
toString in class java.lang.Object