From: Thimo Neubauer (tneubaue_at_ix.urz.uni-heidelberg.de)
Date: 30. Oct 1999
On Sat, Oct 30, 1999 at 12:35:26PM +0100, Matthias Dumke wrote:
> May I ask you for pointers on how to set up a local
> search engine for one's own web pages?
Two programs come to mind offhand:
Package: glimpse
Description: Full-text indexing and searching tools
Glimpse is a very powerful indexing and query system that allows you to
search through all your files very quickly. It can be used by individuals
for their personal file systems as well as by organizations for large data
collections. Glimpse is the default search engine in Harvest.
Package: htdig
Description: WWW search system for an intranet or small internet
The ht://Dig system is a complete world wide web indexing and searching
system for a small domain or intranet. This system is not meant to
replace the need for powerful internet-wide search systems like Lycos,
Infoseek, Webcrawler and AltaVista. Instead it is meant to cover the
search needs for a single company, campus, or even a particular sub
section of a web site.
.
As opposed to some WAIS-based or web-server based search engines,
ht://Dig can span several web servers at a site. The type of these different
web servers doesn't matter as long as they understand the HTTP 1.0
protocol.
.
Features:
* Intranet searching
* It is free
* Robot exclusion is supported
* Boolean expression searching
* Configurable search results
* Fuzzy searching
* Searching of HTML and text files
* Keywords can be added to HTML documents
* Email notification of expired documents
* A Protected server can be indexed
* Searches on subsections of the database
* Full source code included
* The depth of the search can be limited
* Full support for the ISO-Latin-1 character set
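Two of the features listed above, robot exclusion and the depth limit, can be sketched in a few lines of Python. This is only an illustration of the idea, not how ht://Dig implements it. The link graph LINKS, the robots.txt content, the agent name "example-indexer" and the function crawl() are all made up for this example; a real indexer would fetch the pages and robots.txt over HTTP.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (assumption, not taken from any real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical in-memory link graph standing in for pages fetched over HTTP.
LINKS = {
    "http://www.example.org/": [
        "http://www.example.org/a.html",
        "http://www.example.org/private/x.html",
    ],
    "http://www.example.org/a.html": ["http://www.example.org/b.html"],
    "http://www.example.org/b.html": ["http://www.example.org/c.html"],
    "http://www.example.org/c.html": [],
    "http://www.example.org/private/x.html": [],
}


def crawl(start, max_hops, agent="example-indexer"):
    """Breadth-first walk that honours robots.txt and stops after max_hops links."""
    robots = RobotFileParser()
    robots.parse(ROBOTS_TXT.splitlines())

    seen = {start}
    visited = []
    queue = deque([(start, 0)])
    while queue:
        url, hops = queue.popleft()
        # Robot exclusion: skip anything the site owner has disallowed.
        if not robots.can_fetch(agent, url):
            continue
        visited.append(url)
        # Depth limit: do not follow links from pages at the maximum hop count.
        if hops == max_hops:
            continue
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append((link, hops + 1))
    return visited


if __name__ == "__main__":
    for url in crawl("http://www.example.org/", max_hops=2):
        print(url)
```

With max_hops=2 the walk stops at b.html and never reaches c.html, and the page under /private/ is skipped regardless of the depth setting.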
glimpse can index arbitrary objects and is most likely overkill for a
website. On the other hand, it fits nicely into a Harvest
hierarchy :-)
ht://Dig is therefore probably the better fit for websites. One drawback
of this program, however, is that building the indices burns a lot of
CPU time (it is no fun sitting at the machine while it builds its
databases; I once ran ht://Dig as an indexer for my wwwoffle data).
See you around
Thimo
-- 
Thimo Neubauer <thimo_at_debian.org>
Debian GNU/Linux 2.1 released! See http://www.debian.org/ for details
This archive was generated by hypermail 2.1.2 : 11 Mar 2002 CET