Yes! We finally did it. They are very rare though – 3 pieces only!!!



The DidaxTo approach is fully presented in a paper accepted for publication by the Knowledge and Information Systems Journal (KAIS).
Pantelis Agathangelou, Ioannis Katakis, Ioannis Koutoulakis, Fotios Kokkoras and Dimitrios Gunopulos. “Learning Patterns for Discovering Domain Oriented Opinion Words”, Knowledge and Information Systems Journal (Springer), 6, pp 1-33, 2017. DOI: 10.1007/s10115-017-1072-y
DidaxTo (the application) is available for free for non-commercial use.
DidaxTo implements an unsupervised approach for discovering patterns that extract a domain-specific opinion-word dictionary from reviews. The approach utilizes opinion modifiers, sentiment consistency theories, polarity assignment graphs and pattern similarity metrics.
More details are on the DidaxTo page.
A new challenge ahead!
We are going to help a start-up build its proof-of-concept sentiment analysis application. We will provide structured data scraped from some challenging web sources.
Web extraction techniques can provide the initial amounts of data required by data-intensive apps. After a proof-of-concept application is built (and funding is probably secured), safer data sources can be sought. Our own fuelGR family of apps was built around this scenario.
We were recently building a couple of wrappers for Jürgen (from Germany) when we ran into a strange issue. The pound sterling character we had used in a regular expression (required for extracting prices in the UK currency) was not recognized as a valid UTF-8 character by either the GUI or the CLE version of DEiXTo.
The solution was to replace “£” with “\xA3” (the hexadecimal escape for the same character) in the regular expression. Both XML parsers (MSXML for GUI DEiXTo and the Perl XML parser for DEiXTo CLE) then worked fine, and the extraction ran flawlessly with 100% recall.
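To illustrate the idea outside DEiXTo, here is a minimal Python sketch of a price pattern that writes the pound sign as the hex escape \xA3 instead of the literal character. The pattern, the function name extract_prices and the sample text are our own assumptions for illustration; they are not the expressions used in the actual wrappers.

```python
import re

# Hypothetical pattern: the pound sign is written as the hex escape \xA3,
# so the pattern source itself stays pure ASCII.
# It accepts optional whitespace, thousands separators and two decimals.
PRICE_RE = re.compile(r"\xA3\s*(\d+(?:,\d{3})*(?:\.\d{2})?)")


def extract_prices(text):
    """Return the numeric part of every pound-sterling price found in text."""
    return [amount.replace(",", "") for amount in PRICE_RE.findall(text)]


# The sample also uses an escape (\u00a3) so this file contains no literal pound sign.
sample = "Was \u00a31,299.00, now \u00a3999.50 (save \u00a3 299.50)"
print(extract_prices(sample))  # ['1299.00', '999.50', '299.50']
```

Keeping the pattern ASCII-only means it passes unchanged through configuration files and parsers that may disagree about the file's encoding, which is likely why the escape worked in both DEiXTo versions.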
By the way, here are a couple of useful regular expressions:
Many thanks to my students Kostas Papaioannou and Vasilis Pallas for helping me find my way.