Download

DEiXTo is distributed in the hope that it will be useful. It definitely does the job, but it comes WITHOUT ANY WARRANTY. We are eager to hear your feedback and we usually provide support. Questions, comments, suggestions and bug reports are all welcome. Please send us your feedback!

The latest versions of both GUI DEiXTo (MS Windows) and DEiXTo CLE (cross-platform) are available for download. Documentation is also provided.

Important Notice: Prior to deploying DEiXTo for your next extraction task, make sure that you don’t violate any access and/or copyright restrictions set by the target site. You should check their copyright statement and the robots.txt file in their root folder.
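As a rough illustration of the robots.txt check, the sketch below uses Python's standard `urllib.robotparser` module to test whether a URL may be fetched. The helper name `is_allowed`, the user agent string `my-extractor`, the sample rules and the example.com URLs are all assumptions made for this example; they are not part of DEiXTo.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content. In practice you would read the file
# found in the root folder of the target site.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "my-extractor", "https://example.com/products/"))
print(is_allowed(SAMPLE_ROBOTS, "my-extractor", "https://example.com/private/data"))
```

With these sample rules the first URL is permitted and the second is not. A check like this covers robots.txt only; the site's copyright statement still has to be read by a person.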

GUI DEiXTo

2014-Apr-17: DEiXTo_2.9.8.5 (Windows only) is available for download! You MUST read the freeware licence agreement before downloading or using it. This is a Windows-only application because it requires Internet Explorer’s HTML parser and rendering engine.

If you have used DEiXTo before, we would appreciate your feedback on our Testimonials page. Thank you!

GUI DEiXTo – Recent Changes

  • 2.9.8.5: Minor improvements.
  • 2.9.8.4: Matching the text of the “Next Page” link in multi-page crawling scenarios is now case-sensitive. DEiXTo CLE was updated accordingly.
  • 2.9.8.3: Saving the extracted data from the Output tab now works as expected 😉 Moreover, the ability to introduce a delay between successive HTTP calls was added (this is an application setting, so it is not saved in the wpf file).
  • 2.9.8.1: Saving the extracted data from the Output tab now correctly takes into account the check state of the “Extract record’s native URL” checkbox in the Project Info tab.
  • 2.9.8: Fixed an issue regarding validation of wpf files against wpf.dtd.
  • 2.9.7: Following user requests, the minimum height of the application window was reduced to 600 pixels.
  • 2.9.5: Interface improvements that better utilize large monitors.
  • 2.9.4:
    • ability to suppress JavaScript errors in the embedded browser (on by default)
    • ability to save the scraped data in tab-delimited txt format during wrapper testing/tuning
    • Unicode support in the extracted data listview during wrapper testing/tuning
    • the DEiXTo window resizes better now
    • some minor pixel-level adjustments in the various GUI elements
    • improved DOM handling (those rare access violation errors were eliminated)
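To illustrate what case-sensitive matching of the “Next Page” link text (change 2.9.8.4) means in practice, here is a minimal sketch in Python. It is not DEiXTo's actual code: the class `LinkCollector`, the function `find_next_page` and the sample HTML are hypothetical names introduced only for this example, and it relies solely on the standard `html.parser` module.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def find_next_page(html, link_text):
    """Return the href of the first link whose text equals link_text exactly."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    for text, href in collector.links:
        if text == link_text:  # exact comparison, so letter case matters
            return href
    return None


# Hypothetical page fragment with two similar-looking links.
page = '<a href="/page/1">next</a> <a href="/page/2">Next</a>'
print(find_next_page(page, "Next"))
print(find_next_page(page, "NEXT"))
```

Because the comparison is exact, a wrapper configured with “Next” follows only the second link here, and a pattern of “NEXT” matches nothing.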

GUI DEiXTo Documentation

DEiXTo CLE (Command Line Executor)

DEiXTo CLE is a standalone utility (implemented in Perl) that executes wrappers created with GUI DEiXTo. For the complete list of supported command line parameters, please check the readme.txt file in the distributed zip file. DEiXTo CLE comes in two flavours:

Both versions of DEiXTo CLE are released under the terms of the GNU General Public License (GPL) Version 3.

DEiXTo CLE – Recent Changes

  • 1.4.0:
    • added a pagenc command line parameter to force the use of a certain encoding
    • the “follow next page” mechanism is now case-sensitive
    • added a check in _getCharsetFromHeader (if $charset is undef, then it should be utf8)
  • 1.3.0: database support and a powerful post-processing mechanism were added
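The charset fallback mentioned under 1.4.0 (when no charset can be read from the header, assume UTF-8) can be sketched as follows. DEiXTo CLE itself is written in Perl; this Python version is only an analogue, and the function name `charset_from_header` and its `default` parameter are assumptions made for the example.

```python
from email.message import Message


def charset_from_header(content_type, default="utf-8"):
    """Return the charset named in a Content-Type header value, or default."""
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset() or default


print(charset_from_header("text/html; charset=ISO-8859-7"))
print(charset_from_header("text/html"))
print(charset_from_header(None))
```

The first call yields the charset from the header and the other two fall back to the default. An explicit override such as the pagenc parameter would simply bypass a function like this.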

No installation is needed to use DEiXTo CLE; you just run it from the command line. By default, the executor complies with any robots.txt file that exists on the target website. However, you can override that by setting the ‘-nice’ command line option to 0 (zero). Respecting the webmaster’s requests and keeping out of pages that have access restrictions is up to you!

DEiXTo and DEiXTo CLE are distributed in the hope that they will be useful, but without any warranty. The entire risk of using them is assumed by you.
