Welcome to DEiXTo!
DEiXTo (or ΔEiXTo) is a powerful web data extraction tool that is based on the W3C Document Object Model (DOM). It allows users to create highly accurate "extraction rules" (wrappers) that describe what pieces of data to scrape from a website. DEiXTo consists of two separate, standalone components:
- GUI DEiXTo, an MS Windows™ application implementing a graphical user interface that is used to manage extraction rules (build, test, fine-tune, save and modify), and
- DEiXTo Executor, a standalone extraction rule executor (command-line utility) that automatically applies extraction rules to target HTML pages in bulk and produces structured output in a variety of formats.
DEiXTo can handle a wide range of websites with high precision and recall, since it provides the user with an arsenal of features for constructing well-engineered extraction rules. Wrappers built with GUI DEiXTo can be scheduled to run automatically, providing periodic access to resources of interest and saving users a lot of time, energy and repetitive effort.
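To give a rough feel for what a DOM-based extraction rule does, here is a minimal Python sketch. It is not DEiXTo's rule format or its API: the sample page, the field names and the `apply_rule` helper are all hypothetical, and the "rule" is hard-coded as "for every list item, capture the link text, the link target and the text of the span".

```python
from xml.dom.minidom import parseString

# Hypothetical page fragment, kept well-formed so the stdlib DOM parser accepts it.
PAGE = """
<html><body>
  <ul>
    <li><a href="/item/1">First item</a><span>10</span></li>
    <li><a href="/item/2">Second item</a><span>20</span></li>
    <li>No link here</li>
  </ul>
</body></html>
"""


def text_of(node):
    """Concatenate all text nodes found under a DOM node."""
    return "".join(
        child.data if child.nodeType == child.TEXT_NODE else text_of(child)
        for child in node.childNodes
    )


def apply_rule(html):
    """Toy rule (illustrative only): one record per <li> that holds an <a> and a <span>."""
    doc = parseString(html.strip())
    records = []
    for li in doc.getElementsByTagName("li"):
        links = li.getElementsByTagName("a")
        spans = li.getElementsByTagName("span")
        if not links or not spans:
            continue  # the pattern does not match this list item
        records.append({
            "title": text_of(links[0]).strip(),
            "url": links[0].getAttribute("href"),
            "price": text_of(spans[0]).strip(),
        })
    return records


if __name__ == "__main__":
    for record in apply_rule(PAGE):
        print(record)
```

The point of the sketch is only that a rule is a pattern over the DOM tree: nodes that match the pattern yield structured records, and nodes that do not match are skipped.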
ΔEiXTo is an acronym for Data Extraction Tool.
First of all, Δ is the Greek equivalent of D. Now, you are probably wondering what the “i” character is all about.
Well, in Greek “ΔEIXTO” (pron. dechto) is the imperative form of “point at”, which is what the DEiXTo user does inside a browser window when starting to build a DEiXTo extraction rule.
Now you know... ;-)
