“GUI DEiXTo” feature list

  • user-friendly graphical interface – no programming required
  • enhanced, tree-based extraction rules (wrappers)
  • HTML tag filtering (sometimes, ignoring some tags makes life easier)
  • tolerates structural variations in the HTML source code of the record instances
  • fast, flexible, high-performance tree pattern matching algorithm
  • in most cases, 100% precision and recall can be achieved
  • automatic simple form submission
  • multi-record, multi-page and many-URLs extraction modes
  • regular expression support
  • can follow “Next Page” links with adjustable crawling depth
  • can create RSS feeds from any web source
  • can export results to XML and tab-delimited formats
  • can extract text, URLs and HTML source code
  • XML-encoded wrapper project files (.wpf) – can be executed at will
  • wrapper files are compatible with DEiXTo Executor
  • command line execution to schedule extraction tasks with MS Scheduler
  • last but not least, it’s freeware!
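
For readers unfamiliar with the terms, precision is the share of extracted records that are correct, and recall is the share of expected records that were actually extracted. The sketch below is only an illustration of how the two figures could be checked for a wrapper's output; the function name and the sample records are hypothetical and are not part of DEiXTo.

```python
def precision_recall(extracted, expected):
    """Compare extracted records against a hand-checked expected set.

    Both arguments are iterables of hashable records (e.g. strings).
    Returns a (precision, recall) tuple of floats.
    """
    extracted, expected = set(extracted), set(expected)
    true_positives = len(extracted & expected)
    # Guard against division by zero when either set is empty.
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical sample: three records extracted, four expected, two in common.
p, r = precision_recall(["a", "b", "c"], ["a", "b", "d", "e"])
print(f"precision={p:.2f} recall={r:.2f}")
```

A wrapper reaches 100% on both measures when the extracted set and the expected set are identical.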

“DEiXTo Executor” feature list

  • portable, efficient and fast command line executor of GUI DEiXTo wrappers
  • provides options and flexibility that you cannot get with GUI DEiXTo
  • supports additional output formats such as CSV, Excel and OpenDocument Spreadsheet (.ods)
  • provides database support via DBI (the Database independent interface for Perl) and a dbconfig file
  • supports HTML output using an HTML template processor and an editable template file
  • command line options can override those in .wpf files
  • overwrite, append and prepend output modes for all supported formats
  • proxy support
  • can be scheduled to execute wrappers automatically (e.g. using cron in GNU/Linux)
  • can sleep for random time intervals between HTTP requests to avoid making webmasters mad
  • it is free and open source, distributed under the GNU General Public License (GPL) Version 3!
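
The idea of sleeping for random intervals between HTTP requests can be sketched in a few lines. This is a generic illustration in Python, not DEiXTo Executor's actual code (the Executor is Perl-based); the function name, the default 1 to 5 second range, and the injectable fetch and sleep callables are all assumptions made for the example.

```python
import random
import time


def fetch_all(urls, fetch, min_s=1.0, max_s=5.0, sleep=time.sleep, rng=None):
    """Fetch each URL in turn, pausing a random time between requests.

    fetch  -- callable taking a URL and returning its content (hypothetical)
    min_s, max_s -- bounds of the random pause in seconds (assumed defaults)
    sleep  -- injectable so the pause can be stubbed out in tests
    """
    rng = rng or random.Random()
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            # No pause before the first request, a random one before the rest.
            sleep(rng.uniform(min_s, max_s))
        results.append(fetch(url))
    return results
```

Randomizing the pause, rather than using a fixed delay, spreads the load on the target server and makes the request pattern less mechanical.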


“When I first found DEiXTo I thought it would be of limited value if useful at all. To my great surprise, DEiXTo is very easy to use, produces results that need minimal to no editing and has saved me more time than I can calculate. Thank you for providing this tool!”

Scott Sidney
