![[my name] CV](inc/common/severyn-ava.jpg)
Department of Information Engineering
and Computer Science,
University of Trento, Italy
Via Sommarive 5
38123 Trento (Italy)
Phone: +39 0461 285250
Email: severyn
disi.unitn.it
Hello, I'm Aliaksei Severyn. My research currently focuses on applications of machine learning techniques in computational linguistics, in particular large-scale training of SVMs with structural kernels, e.g. tree kernels. I am also working on the problem of efficient structured prediction for complex outputs, such as sequences and trees, when structural kernels are used. My advisor is Alessandro Moschitti.
I am currently a PhD student in Computer Science at the University of Trento, Italy. I received my first degree with honors in Radiophysics and Electronics from Belarusian State University. In my thesis I applied statistical learning algorithms, namely Support Vector Regression, to electric circuit design. Before starting my PhD studies I completed the first year of the double-degree European Master in Informatics (EuMI) program (offered by the University of Trento and RWTH Aachen University, Germany) with a GPA of 28.5 out of 30. Prior to coming to the University of Trento, I worked as an Analytic Expert at ScienceSoft Inc. on a project developing automatic trading systems for financial markets, where I researched and proposed for implementation a number of state-of-the-art machine learning and data mining techniques to enhance the forecasting algorithms. In the summer of 2009 I completed an internship at Fondazione Bruno Kessler (FBK), working on the Copilosk project under the supervision of Luciano Serafini.
- Aliaksei Severyn, Alessandro Moschitti
Fast Support Vector Machines for Structural Kernels. [Best Student Paper Award]. Invited to DMKD journal.
ECML/PKDD (3) 2011: 175-190
[PDF] In this paper, we propose three important enhancements of the approximate cutting plane algorithm (CPA) to train Support Vector Machines with structural kernels: (i) we exploit a compact yet exact representation of cutting plane models using directed acyclic graphs to speed up both training and classification, (ii) we provide a parallel implementation, which makes the training scale almost linearly with the number of CPUs, and (iii) we propose an alternative sampling strategy to handle class-imbalanced problems and show that theoretical convergence bounds are preserved. The experimental evaluations on three diverse datasets demonstrate the soundness of our approach and the possibility to carry out fast learning and classification with structural kernels.
[BibTex]
@inproceedings{Severyn:2011:ECML,
author = {Severyn, Aliaksei and Moschitti, Alessandro},
title = {Fast Support Vector Machines for Structural Kernels},
booktitle = {ECML/PKDD (3)},
year = {2011},
isbn = {978-3-642-23807-9},
pages = {175-190},
keywords = {natural language processing, structural kernels, support vector machines},
} [Slides] [Video]
- Aliaksei Severyn, Alessandro Moschitti
Large-Scale Support Vector Learning with Structural Kernels.
ECML/PKDD (3) 2010: 229-244
[PDF] In this paper, we present an extensive study of the cutting-plane algorithm (CPA) applied to structural kernels for advanced text classification on large datasets. In particular, we carry out comprehensive experiments on two interesting natural language tasks, namely predicate argument extraction and question answering. Our results show that (i) CPA applied to train a non-linear model with different tree kernels fully matches the accuracy of the conventional SVM algorithm while being ten times faster; (ii) by using smaller sampling sizes to approximate subgradients in CPA we can trade off accuracy for speed, yet the optimal parameters and kernels found remain optimal for the exact SVM. These results open numerous research perspectives, e.g. in natural language processing, as they show that complex structural kernels can be efficiently used in real-world applications. For example, for the first time, we could carry out extensive tests of several tree kernels on millions of training instances. As a direct benefit, we could experiment with a variant of the partial tree kernel, which we also propose in this paper.
[BibTex]
@inproceedings{Severyn:2010:ECML,
author = {Severyn, Aliaksei and Moschitti, Alessandro},
title = {Large-scale support vector learning with structural kernels},
booktitle = {ECML/PKDD (3)},
year = {2010},
isbn = {978-3-642-15938-1},
pages = {229-244},
keywords = {natural language processing, structural kernels, support vector machines},
} [Slides]
More papers are coming soon...
uSVM-TK: integration of SVM-Light-TK and approximate Cutting Plane Algorithm with sampling
The use of kernels forces us to solve the SVM optimization problem in the dual space, which involves O(n^2) kernel computations for n training examples. This cost becomes critical on very large datasets, where it makes SVM learning prohibitively expensive. uSVM-TK overcomes this bottleneck by integrating an approximate cutting plane method to train classification SVMs with structural kernels, e.g. Tree Kernels, on very large datasets. The approach employs random sampling to obtain considerable speed-ups (over a factor of 10) while delivering the same accuracy as exact SVM solvers.
+ A versatile tool for application of Tree Kernels to a variety of NLP
tasks on very large-scale datasets.
+ Encodes state-of-the-art structural kernels.
+ At least 10 times as fast as an exact solver while achieving the same
classification accuracy. For example, training a conventional SVM-light
solver with tree kernels on 1 million examples takes more than seven
days, while uSVM-TK reaches the same accuracy in a few hours.
+ By decreasing the sample size used to approximate the cutting planes,
we can trade a small amount of accuracy for even faster training.
+ Can be used for fast estimation of the best kernel, its
hyper-parameters, and the trade-off parameter. The identified set of
optimal parameters can be further used to train more computationally
expensive and accurate models.
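To give a feel for the sampling trade-off described above, here is a minimal Python sketch. It is not uSVM-TK code: `fragment_overlap` is a toy stand-in for a tree kernel, and `exact_score` / `sampled_score` are hypothetical helpers that average a labelled kernel expansion over all training examples versus over a random sample. The real cutting plane computation is more involved; the sketch only shows how the number of kernel evaluations drops from n to the sample size.

```python
import random

def fragment_overlap(a, b):
    # Toy stand-in for a tree kernel (assumption): number of shared fragments.
    return len(a & b)

def exact_score(x, examples, labels, kernel):
    # Uses every training example: len(examples) kernel evaluations.
    total = sum(y * kernel(x, z) for z, y in zip(examples, labels))
    return total / len(examples)

def sampled_score(x, examples, labels, kernel, sample_size, rng):
    # Uses a uniform sample without replacement: sample_size kernel evaluations.
    picked = rng.sample(range(len(examples)), sample_size)
    total = sum(labels[i] * kernel(x, examples[i]) for i in picked)
    return total / sample_size

# Synthetic data: each "tree" is reduced to a set of 8 fragment ids.
rng = random.Random(0)
vocab = list(range(50))
examples = [frozenset(rng.sample(vocab, 8)) for _ in range(2000)]
labels = [rng.choice([-1, 1]) for _ in examples]
query = frozenset(rng.sample(vocab, 8))

exact = exact_score(query, examples, labels, fragment_overlap)
approx = sampled_score(query, examples, labels, fragment_overlap, 200, rng)
print("exact:", exact, "sampled (200 of 2000):", approx)
```

A smaller `sample_size` means fewer kernel evaluations and a noisier estimate, which is the accuracy-for-speed trade-off in miniature.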
[NEW!] Greatly improved training algorithm: SDAG
Download SDAG v1.0 (coming soon)
+ Faster training and classification: speedups of up to 30x
+ Takes advantage of a compact representation of a tree forest as a directed acyclic graph (see our ECML 2011 paper)
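The idea of storing a tree forest as a directed acyclic graph can be illustrated with a short Python sketch. This is not the SDAG implementation: it only shows the general technique of sharing identical subtrees, under the assumption that a tree is written as a nested tuple `(label, child, child, ...)`. The names `add_tree` and `count_nodes` are hypothetical.

```python
def add_tree(tree, dag):
    # Insert a tree into the DAG and return the id of its root node.
    # Identical subtrees map to the same key, so they are stored only once.
    label, *children = tree
    key = (label, tuple(add_tree(child, dag) for child in children))
    if key not in dag:
        dag[key] = len(dag)
    return dag[key]

def count_nodes(tree):
    # Number of nodes in a single tree stored without any sharing.
    return 1 + sum(count_nodes(child) for child in tree[1:])

t1 = ("S", ("NP", ("D", ("the",)), ("N", ("dog",))), ("VP", ("V", ("barks",))))
t2 = ("S", ("NP", ("D", ("the",)), ("N", ("dog",))), ("VP", ("V", ("sleeps",))))

dag = {}
roots = [add_tree(t, dag) for t in (t1, t2)]
total_nodes = count_nodes(t1) + count_nodes(t2)
print("nodes in forest:", total_nodes, "nodes in DAG:", len(dag))
```

The two trees share the whole `NP` subtree, so the DAG needs fewer nodes than the forest; anything computed per node is then done once per distinct subtree.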
You can download uSVM-TK below:
Download uSVM-TK v2.0
+ Uses the LIBQP library as the QP solver instead of Mosek
+ Installation is much easier, as no external libraries are required
+ Implements a new sampling strategy based on rejection sampling to handle class-imbalanced datasets (similar to the -j option in SVM-light)
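As a rough illustration of rejection sampling on an imbalanced dataset, consider the Python sketch below. It is not the code shipped in uSVM-TK v2.0. The helper `rejection_sample` is hypothetical and assumes labels in {+1, -1} and a factor `j >= 1` that says how much more readily positive examples are accepted than negative ones, loosely in the spirit of a cost factor such as -j.

```python
import random

def rejection_sample(labels, sample_size, j, rng):
    # Hypothetical helper (assumption): labels are +1 / -1 and j >= 1.
    # Propose an index uniformly, always keep positives, keep negatives
    # with probability 1/j, and repeat until enough indices are accepted.
    accept_prob = {1: 1.0, -1: 1.0 / j}
    picked = []
    while len(picked) < sample_size:
        i = rng.randrange(len(labels))
        if rng.random() < accept_prob[labels[i]]:
            picked.append(i)
    return picked

# 10% positive, 90% negative examples.
labels = [1] * 100 + [-1] * 900
rng = random.Random(0)
sample = rejection_sample(labels, 2000, j=9.0, rng=rng)
pos_fraction = sum(labels[i] == 1 for i in sample) / len(sample)
print("positive fraction in sample:", pos_fraction)
```

With 10% positives and `j = 9`, positives and negatives are accepted at about the same overall rate, so the sample is roughly balanced even though the data is not.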
You can download an earlier version of uSVM-TK from the link below:
Download uSVM-TK v1.0 (no longer supported)