ll-train from LibLinear must be on the executable path. The executable can be found in the main directory, but it can be placed anywhere on the system.
cv.sh. On Windows, you'll have to call Java directly or write a simple BAT script that does the same thing as the shell script.
cv.sh contains experimental parameters, such as the directory containing the categories (CAT_DIR).
src directory.
WSDLReader.java contains the code to parse the XML and extract words. It can also be run on its own to print the words extracted from a single WSDL file:

    java -cp wsdl_categorizer.jar:utilities.jar:trove.jar wsdl_categorizer.WSDLReader FILE

where FILE is the WSDL file you want to parse. On Windows, separate the classpath entries with ; instead of :.
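To make the "parse the XML and extract words" step concrete, here is a minimal sketch of one way it could be done with the XML parser that ships with the JDK. It is not the code from WSDLReader.java: the class name WsdlWordSketch, both method names, and the choice to take words only from name attributes and split them at case changes are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not the WSDLReader class from the project.
class WsdlWordSketch {

    // Split an identifier such as "GetStockQuote" into lowercase words.
    static List<String> splitIdentifier(String identifier) {
        List<String> words = new ArrayList<>();
        // Break at lower-to-upper case boundaries and at runs of non-letters.
        for (String part : identifier.split("(?<=[a-z])(?=[A-Z])|[^A-Za-z]+")) {
            if (!part.isEmpty()) {
                words.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return words;
    }

    // Collect words from the "name" attribute of every element, in document order.
    // Assumption: only "name" attributes are used as a source of words.
    static List<String> extractWords(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> words = new ArrayList<>();
            NodeList elements = doc.getElementsByTagName("*");
            for (int i = 0; i < elements.getLength(); i++) {
                Element element = (Element) elements.item(i);
                if (element.hasAttribute("name")) {
                    words.addAll(splitIdentifier(element.getAttribute("name")));
                }
            }
            return words;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse WSDL document", e);
        }
    }
}
```

Calling WsdlWordSketch.extractWords on the text of a WSDL file returns the list of words in document order. The real reader may draw on other parts of the document or tokenize differently.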
WSDLCategorizer.java is where the experiments are carried out.
Micro accuracy: 14 / 14 = 1.0
stock and weather identifiers, respectively:
0. stock
1. weather
0: 8 0
1: 0 6
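The reported micro accuracy of 14 / 14 can be recomputed from this matrix: the diagonal holds the correctly classified files (8 + 6 = 14) and the sum of all cells is the total number of files (14). The sketch below shows that calculation. The class and method names are hypothetical and are not taken from WSDLCategorizer.java. The result does not depend on whether rows stand for actual or predicted categories, which the output does not say.

```java
// Hypothetical helper for illustration; not taken from WSDLCategorizer.java.
class MicroAccuracySketch {

    // Micro accuracy: correctly classified items (the diagonal) divided by all items.
    static double microAccuracy(int[][] confusion) {
        long correct = 0;
        long total = 0;
        for (int row = 0; row < confusion.length; row++) {
            for (int col = 0; col < confusion[row].length; col++) {
                total += confusion[row][col];
                if (row == col) {
                    correct += confusion[row][col];
                }
            }
        }
        if (total == 0) {
            throw new IllegalArgumentException("confusion matrix has no entries");
        }
        return (double) correct / total;
    }
}
```

For the matrix above, MicroAccuracySketch.microAccuracy(new int[][] {{8, 0}, {0, 6}}) gives 14 / 14.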
0: [1.0, 1.0, 1.0, 8.0, 8.0, 8.0]
1: [1.0, 1.0, 1.0, 6.0, 6.0, 6.0]
*** For category "stock"
stock 0.1777026862733373
list 0.177284332878544
symbol 0.1716715433121654
currency 0.1590836172385301
value 0.1560681923697255
last 0.1552034102078856
quote 0.1548232007012907
security 0.1492285768311836
dollar 0.1460838390059707
all 0.1458992697230641
*** For category "weather"
weather 0.4236339104364145
zip 0.283701847546839
forecast 0.2218413016922719
code 0.2013819822112159
report 0.1723113339087071
city 0.166515341541315
day 0.1284451862296781
get 0.1155969040906293
u 0.1055154027396875
d 0.1026760476086771