ll-train from LibLinear must be on your executable path (PATH). The executable is provided in the main directory, but it can be placed anywhere on the system.
cv.sh. On Windows, you'll have to call Java directly or implement a simple BAT script that does the same thing as the shell script.
cv.sh contains experimental parameters such as the directory containing the categories (CAT_DIR).
src directory.
WSDLReader.java contains the code to parse the XML and extract words. It can also be run on its own to print the words extracted from a single WSDL file:

    java -cp wsdl_categorizer.jar:utilities.jar:trove.jar wsdl_categorizer.WSDLReader FILE

where FILE is the WSDL file you want to parse.
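To illustrate the kind of word extraction described above, here is a minimal sketch. It is not the code in WSDLReader.java: the class name WsdlWordSketch, the choice to read only "name" attributes, and the camelCase/underscore splitting rule are all assumptions made for illustration. It uses only the standard JDK XML parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch; the real WSDLReader may tokenize differently.
final class WsdlWordSketch {

    // Split an identifier such as "GetStockQuote" or "zip_code" into lowercase words.
    static List<String> splitIdentifier(String id) {
        List<String> words = new ArrayList<>();
        // Insert a space at each lower-to-upper transition, then split on non-letters.
        String spaced = id.replaceAll("([a-z])([A-Z])", "$1 $2");
        for (String part : spaced.split("[^A-Za-z]+")) {
            if (!part.isEmpty()) {
                words.add(part.toLowerCase());
            }
        }
        return words;
    }

    // Collect words from the "name" attribute of every element (assumed word source).
    static List<String> extractWords(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList all = doc.getElementsByTagName("*"); // every element, in document order
            List<String> words = new ArrayList<>();
            for (int i = 0; i < all.getLength(); i++) {
                Element e = (Element) all.item(i);
                if (e.hasAttribute("name")) {
                    words.addAll(splitIdentifier(e.getAttribute("name")));
                }
            }
            return words;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse XML", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<definitions><operation name=\"GetStockQuote\"/>"
                   + "<message name=\"zip_code\"/></definitions>";
        System.out.println(extractWords(xml)); // prints [get, stock, quote, zip, code]
    }
}
```

Reading only "name" attributes keeps the sketch short; a fuller extractor would probably also look at documentation text and type names.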
WSDLCategorizer.java is where the experiments are carried out.
Micro accuracy: 14 / 14 = 1.0
stock and weather identifiers, respectively:
0. stock
1. weather

0: 8 0
1: 0 6
0: [1.0, 1.0, 1.0, 8.0, 8.0, 8.0]
1: [1.0, 1.0, 1.0, 6.0, 6.0, 6.0]
*** For category "stock"
stock 0.1777026862733373
list 0.177284332878544
symbol 0.1716715433121654
currency 0.1590836172385301
value 0.1560681923697255
last 0.1552034102078856
quote 0.1548232007012907
security 0.1492285768311836
dollar 0.1460838390059707
all 0.1458992697230641

*** For category "weather"
weather 0.4236339104364145
zip 0.283701847546839
forecast 0.2218413016922719
code 0.2013819822112159
report 0.1723113339087071
city 0.166515341541315
day 0.1284451862296781
get 0.1155969040906293
u 0.1055154027396875
d 0.1026760476086771
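
The micro accuracy reported in the sample output (14 / 14 = 1.0) can be recomputed from the 2x2 confusion matrix for stock and weather: it is the sum of the diagonal divided by the sum of all cells. The following is a minimal sketch of that arithmetic. The class and method names are hypothetical and are not taken from WSDLCategorizer.java.

```java
// Hypothetical sketch; not the code used by WSDLCategorizer.
final class MicroAccuracySketch {

    // Micro accuracy = correctly classified items / all items.
    // Correct items sit on the diagonal, so the row/column orientation does not matter here.
    static double microAccuracy(int[][] confusion) {
        long correct = 0;
        long total = 0;
        for (int i = 0; i < confusion.length; i++) {
            for (int j = 0; j < confusion[i].length; j++) {
                total += confusion[i][j];
                if (i == j) {
                    correct += confusion[i][j];
                }
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }

    public static void main(String[] args) {
        // Confusion matrix from the sample output: 0 = stock, 1 = weather.
        int[][] confusion = { {8, 0}, {0, 6} };
        System.out.println("Micro accuracy: " + microAccuracy(confusion)); // prints Micro accuracy: 1.0
    }
}
```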