DEiXTo is distributed in the hope that it will be useful. It does the job, but it comes WITHOUT ANY WARRANTY. We are eager to hear your feedback and we usually provide support. Questions, comments, suggestions and bug reports are all welcome. Please send us your feedback!
The latest versions of both GUI DEiXTo (MS Windows) and DEiXTo CLE (cross-platform) are available for download. Documentation is also provided.
2014-Apr-17: DEiXTo_22.214.171.124 (Windows only) is available for download! You MUST read the freeware licence agreement before downloading or using it. This is a Windows-only application because it requires Internet Explorer’s HTML parser and rendering engine.
If you have used DEiXTo before, we would appreciate your feedback on our Testimonials page. Thank you!
GUI DEiXTo – Recent Changes
- 126.96.36.199: Minor improvements.
- 188.8.131.52: Matching the text of the “Next Page” link in multi-page crawling scenarios is now case-sensitive. DEiXTo CLE was updated accordingly.
- 184.108.40.206: Saving the extracted data from the Output tab now works as expected 😉 Moreover, the ability to introduce a delay between successive HTTP calls was added (this is not saved in the wpf file though – it is an application setting).
- 220.127.116.11: Saving the extracted data from the Output tab now correctly takes into account the check state of the “Extract record’s native URL” checkbox in the Project Info tab.
- 2.9.8: Fixed an issue regarding validation of wpf files against wpf.dtd.
- 2.9.7: After user demand, the minimum height of the application window was reduced to 600 pixels.
- 2.9.5: Interface improvements that better utilize large monitors.
- ability to save the scraped data in tab-delimited txt format during wrapper testing/tuning
- Unicode support in the extracted data listview during wrapper testing/tuning
- the DEiXTo window resizes better now
- some minor pixel level adjustments in the various GUI elements
- improved DOM handling (those rare access violation errors were eliminated)
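As an illustration of the tab-delimited export with Unicode support mentioned above, here is a minimal Python sketch. It is not DEiXTo's own code: the function name `write_tab_delimited` and the sample records are hypothetical, and the exact layout of the file DEiXTo writes may differ.

```python
import csv
import io

def write_tab_delimited(records, stream):
    """Write each record (a sequence of field strings) as one tab-separated line."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    for record in records:
        writer.writerow(record)

# Hypothetical extracted records; non-ASCII text is written unchanged.
records = [
    ["Title", "Price"],
    ["Καφές", "2.50"],
]

buffer = io.StringIO()
write_tab_delimited(records, buffer)
output = buffer.getvalue()
```

To write to disk instead of an in-memory buffer, a file opened with `open(path, "w", encoding="utf-8", newline="")` can be passed as `stream`.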
GUI DEiXTo Documentation
DEiXTo CLE (Command Line Executor)
DEiXTo CLE is a standalone utility (implemented in Perl) that executes wrappers created with GUI DEiXTo. For the complete list of supported command line parameters, please check the readme.txt file in the distributed zip file. DEiXTo CLE comes in two flavours:
- 2014-Jan-26: DEiXTo Executor v.1.4.0 for Windows.
It has been tested successfully on Win 2000, XP, Vista and Win7.
- 2011-Jan-06: DEiXTo Executor v.1.3.0 for GNU/Linux.
It has been successfully tested on Ubuntu 10.04 LTS, Fedora 13 and Lucid Puppy 5.0.1. It will probably run without any issues on other modern GNU/Linux distros as well.
Both versions of DEiXTo CLE are released under the terms of the GNU General Public License (GPL) Version 3.
DEiXTo CLE – Recent Changes
- added a pagenc command line parameter to force the use of a certain encoding
- the “follow next page” mechanism is now case-sensitive
- added a check in _getCharsetFromHeader (if $charset is undef, then it should be utf8)
- 1.3.0: database support and a powerful post-processing mechanism were added
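The encoding-related changes above (a forced encoding via pagenc, and falling back to UTF-8 when the header yields no charset) can be sketched roughly as follows. This is a Python illustration only, not the Perl code of `_getCharsetFromHeader`: the function `charset_from_header` and its `forced_encoding` parameter are hypothetical, and the assumption that a forced encoding takes precedence over the header is ours.

```python
def charset_from_header(content_type, forced_encoding=None, default="utf-8"):
    """Pick a page encoding from a Content-Type header value.

    Assumed precedence: a forced encoding (like pagenc) wins, then the
    charset parameter of the header, then the UTF-8 default.
    """
    if forced_encoding:
        return forced_encoding.lower()
    if content_type:
        # Skip the media type itself and scan the parameters after it.
        for part in content_type.split(";")[1:]:
            name, _, value = part.partition("=")
            value = value.strip().strip("\"'")
            if name.strip().lower() == "charset" and value:
                return value.lower()
    # No usable charset found: fall back to the default.
    return default

from_header = charset_from_header("text/html; charset=ISO-8859-7")
missing = charset_from_header("text/html")
forced = charset_from_header("text/html; charset=ISO-8859-7", forced_encoding="windows-1253")
```

With these inputs, the header charset is used when present, a header without a charset falls back to UTF-8, and the forced encoding overrides both.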
No installation is needed to use DEiXTo CLE – you just run it from the command line. By default, the executor complies with any robots.txt file existing on the target website. However, you can override that by setting the ‘-nice‘ command line option to 0 (zero). Respecting the webmaster’s requests and keeping out of pages that have access restrictions is up to you!
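To illustrate what complying with robots.txt means in practice, the following Python sketch uses the standard library's `urllib.robotparser`. It is not how DEiXTo CLE is implemented (the executor is written in Perl); the `is_allowed` helper, its `nice` parameter, the sample rules, and the `MyBot` user agent are all hypothetical stand-ins chosen to mirror the behaviour described above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def is_allowed(robots_txt, user_agent, url, nice=True):
    """Return True if the URL may be fetched.

    With nice=False the robots.txt rules are ignored, loosely mirroring
    the effect of setting the -nice option to 0.
    """
    if not nice:
        return True
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

public_ok = is_allowed(ROBOTS_TXT, "MyBot", "http://example.com/public/index.html")
private_ok = is_allowed(ROBOTS_TXT, "MyBot", "http://example.com/private/page.html")
private_forced = is_allowed(ROBOTS_TXT, "MyBot", "http://example.com/private/page.html", nice=False)
```

Here the public page is allowed, the page under /private/ is refused while the rules are respected, and the same page is allowed once the check is switched off.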
DEiXTo and DEiXTo CLE are distributed in the hope that they will be useful but without any warranty. The entire risk of using them is assumed by you.