Journal Paper about DidaxTo

The DidaxTo approach is fully presented in a paper accepted for publication by the Knowledge and Information Systems Journal (KAIS).

Pantelis Agathangelou, Ioannis Katakis, Ioannis Koutoulakis, Fotios Kokkoras and Dimitrios Gunopulos. “Learning Patterns for Discovering Domain Oriented Opinion Words”, Knowledge and Information Systems Journal (Springer), 6, pp 1-33, 2017. DOI: 10.1007/s10115-017-1072-y

DidaxTo (the application) is available for free for non-commercial use.

Posted in News

DidaxTo is available for experimentation.

DidaxTo implements an unsupervised approach for discovering patterns that extract a domain-specific dictionary from reviews. The approach utilizes opinion modifiers, sentiment consistency theories, polarity assignment graphs and pattern similarity metrics.

More details on the DidaxTo page.


DEiXTo will power a Sentiment Analysis start-up.

A new challenge ahead!

We are going to help a start-up build its proof-of-concept sentiment analysis application. We will provide structured data scraped from some challenging web sources.

Web extraction techniques can provide the initial amounts of data required by data-intensive apps. After a proof-of-concept application is built (and funding has probably been secured), safer data sources can be sought. Our own fuelGR family of apps was built around this scenario.


Pound sterling char in DEiXTo regular expressions

We were recently building a couple of wrappers for Jürgen (from Germany) when we came across a strange issue. The pound sterling character we had used in a regular expression (required for extracting prices in the UK currency) was not recognized as a valid UTF-8 char by either the GUI or the CLE version of DEiXTo.

The solution was to replace “£” with “\xA3” in the regular expression. Both XML parsers (MSXML for GUI DEiXTo and the XML parser of Perl for DEiXTo CLE) then worked fine, and the extraction proceeded flawlessly with 100% recall.
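The same substitution can be automated for other non-ASCII symbols. The following is only a minimal sketch in Python, not part of DEiXTo: the helper name escape_non_ascii is hypothetical, and it assumes the target regex engine accepts \xHH escapes for characters in the U+0080 to U+00FF range, as was the case for the pound sign above.

```python
import re


def escape_non_ascii(pattern: str) -> str:
    """Rewrite characters in U+0080..U+00FF as \\xHH escapes.

    Hypothetical helper for illustration; characters outside that
    range are left untouched.
    """
    out = []
    for ch in pattern:
        code = ord(ch)
        if 0x80 <= code <= 0xFF:
            out.append("\\x{:02X}".format(code))
        else:
            out.append(ch)
    return "".join(out)


POUND = "\u00a3"  # the pound sterling sign

escaped = escape_non_ascii(POUND + r"(\d+)")
print(escaped)  # \xA3(\d+)

# The escaped pattern is pure ASCII but still matches the pound sign.
match = re.search(escaped, "Price: " + POUND + "25")
print(match.group(1))  # 25
```

Keeping the pattern pure ASCII sidesteps whatever encoding the XML file carrying the wrapper happens to be read with.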

By the way, here are a couple of useful regular expressions:

  • to get dates in xx/xx/xxxx format with either one or two digits for day and month use: “(\d{1,2}\/\d{1,2}\/\d{4})” (without the double quotes)
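  • to get £ price data up to 999,999,999.99 in this format, use: “\xA3(\d*,\d*,?\d*\.?\d*)” (without the double quotes). Note that, as written, the pattern expects at least one thousands separator, so amounts below 1,000 (such as 0.01) are not matched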
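Both expressions can be tried out quickly outside DEiXTo. The sketch below uses Python's re module purely as a test bench (DEiXTo itself relies on other regex engines, so behaviour there may differ). The sample strings are made up, and STRICT_PRICE_RE is a hypothetical alternative, not taken from the wrappers, that also accepts amounts without a thousands separator.

```python
import re

# The two expressions from the list above, unchanged.
DATE_RE = re.compile(r"(\d{1,2}\/\d{1,2}\/\d{4})")
PRICE_RE = re.compile(r"\xA3(\d*,\d*,?\d*\.?\d*)")

# Hypothetical stricter variant (an assumption, not from the post):
# 1-3 digits, up to two ",ddd" groups, optional two-digit pence.
STRICT_PRICE_RE = re.compile(r"\xA3(\d{1,3}(?:,\d{3}){0,2}(?:\.\d{2})?)")

POUND = "\u00a3"  # the pound sterling sign


def first_group(regex, text):
    """Return the first captured group, or None if there is no match."""
    m = regex.search(text)
    return m.group(1) if m else None


print(first_group(DATE_RE, "Posted on 5/11/2016"))        # 5/11/2016
print(first_group(PRICE_RE, "Now " + POUND + "1,234.56"))  # 1,234.56
print(first_group(PRICE_RE, "Only " + POUND + "0.01"))     # None
print(first_group(STRICT_PRICE_RE, "Only " + POUND + "0.01"))  # 0.01
```

The third call shows that the original price pattern needs at least one comma, which is worth keeping in mind when the source pages list small amounts.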

Many thanks to my students Kostas Papaioannou and Vasilis Pallas for helping me find my way.


DEiXTo on Facebook

https://www.facebook.com/deixto/
