Welcome to the TESS homepage. TESS is the TElegraph Screen Scraper, part of the Telegraph Project at UC Berkeley. TESS is a program that takes the data returned by web forms (such as search engines or database query pages) and turns it into a representation that a database query processor can use.
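To make the idea of screen scraping concrete, here is a minimal sketch in Python of turning an HTML result page into tuples that a query processor could consume. It is not TESS code and does not use TESS's .jsc directives; the class `TableRowScraper`, the function `scrape_rows`, and the sample page are all hypothetical names and made-up data chosen for illustration, and the sketch assumes the results arrive as a simple HTML table.

```python
from html.parser import HTMLParser


class TableRowScraper(HTMLParser):
    """Hypothetical helper: collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a tuple of cell strings
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a data cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            # Rows with no <td> cells (for example, header rows) are skipped.
            if self._row:
                self.rows.append(tuple(self._row))
            self._row = None
            self._cell = None


def scrape_rows(html):
    """Return the data rows of an HTML page as a list of tuples of strings."""
    parser = TableRowScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample page standing in for the output of a web form.
SAMPLE_PAGE = """
<html><body>
<table>
  <tr><th>Title</th><th>Year</th></tr>
  <tr><td>Query Processing</td><td>2000</td></tr>
  <tr><td>Adaptive Dataflow</td><td>2001</td></tr>
</table>
</body></html>
"""

if __name__ == "__main__":
    for row in scrape_rows(SAMPLE_PAGE):
        print(row)
```

Each tuple corresponds to one result row, which is the kind of relational representation a query processor expects. Real pages are rarely this regular, which is why a dedicated tool with per-site directives is useful.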



NEW: HtmlGet is an interactive command shell that can be used to retrieve web pages and run TESS over them. It provides limited support for extracting and submitting HTML forms.


Download:
The code is still in alpha, so don't be surprised if it breaks. Please send bug reports to tessinfo@cs.berkeley.edu.
HtmlGet.tar.gz

Resources:
  • HtmlGet documentation
  • A tutorial on using TESS
  • A quick reference to the directives allowed in a TESS .jsc file