I built this search system as part of my work at Catalyst IT Ltd. It fills a niche that I saw as unfilled: a search system that handles almost the whole job but doesn’t interfere with the presentation layer of an existing site. This is especially important for dynamic websites that have their own layout systems, access control, user login or other dynamic content.
Existing systems tend to fall into two camps: those that do the whole job, right through to providing a CGI script, and heavyweight enterprise search systems that require the knowledge and time to set up Java servlets, integrate with XML-driven interfaces and fill out pages of configuration.
The engine can spider one or more sites, indexing HTML, text and PDF content. The only configuration this requires is a list of sites. The search script can then be run from any system capable of reading in a simple YAML document. Some assistance with paging results is provided, but how you present the results remains up to you. It is equally suited to a web page search form or a personal desktop search engine.
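To give a feel for how little the presentation layer has to do, here is a minimal Python sketch of paging over results once they have been read from the YAML document. It is not the engine's own paging helper, and the result format is not described here: the `url` and `title` fields, the `paginate` function and the `Page` class are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from math import ceil
from typing import Any, Dict, List


@dataclass
class Page:
    number: int                     # 1-based page number actually returned
    total_pages: int                # always at least 1, even with no results
    results: List[Dict[str, Any]]   # the slice of results for this page


def paginate(results: List[Dict[str, Any]], page: int, per_page: int = 10) -> Page:
    """Return one page of results, clamping out-of-range page numbers."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    total_pages = max(1, ceil(len(results) / per_page))
    number = min(max(page, 1), total_pages)
    start = (number - 1) * per_page
    return Page(number, total_pages, results[start:start + per_page])


# Hypothetical results, shaped as they might look after parsing the YAML
# output (the real field names are an assumption here).
hits = [
    {"url": f"http://example.com/page{i}", "title": f"Page {i}"}
    for i in range(1, 26)
]

second = paginate(hits, page=2, per_page=10)
print(second.number, second.total_pages, second.results[0]["title"])
# prints: 2 3 Page 11
```

Because the helper only returns plain data, the same call could sit behind a web search form or a desktop tool, with the layout left entirely to the surrounding site.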