If making extraction rules isn’t second nature to you, we can do it for you. Most of the time we do it for fun, but if your extraction task is complex and time-consuming, we do charge a bit. For hard cases that require multiple patterns and cooperative wrappers, we can usually write customized DEiXToBot-based Perl scripts to do the job. Tell us by email exactly what the task is: provide the URL and describe the parts of the page you are interested in.

We can definitely help you to:

  • make your legacy website smartphone-friendly
  • quickly populate product catalogues with full specifications
  • monitor prices of the competition
  • transform the contents of a digital library into OAI-PMH or another suitable format
  • get a reliable feed from stock markets
  • prepare large, focused datasets for scientific tasks (e.g. data mining)
  • perform data mining tasks for you (classification, clustering, association rules, opinion mining, summarization, etc.)
  • build alerting web services that inform you when something, somewhere on the web, changes
  • build personalized information applications for you or your clients
  • extract and summarize large volumes of text
  • <your extraction task goes here!>

I was attempting to extract from multi-part webpages and was experiencing difficulty. A quick email to Fotis and a complete explanation of the process along with worked examples arrived in my in-box….! Outstanding! A really powerful tool without bloat. Many thanks…


Getting data from many unstructured web pages, often repetitively and with extensive copy-paste operations, is tedious and time-consuming. Wouldn’t it be nice to define the content you want from a web page once and then have an application do the laborious job for you? We can help you accomplish this or, even better, do it for you for free or at a small cost!

We can provide you with either the extraction rules, so you can gather the data yourself, or the clean, structured data itself. Check out some happy DEiXTo users!
