Reaching Decentralised Search Engine

------------------------------------
# just remarks


YaCy
----
- 17+ years old, but still somewhat immature & hardware inefficient
- sadly enough, the best p2p search engine we have
- P2P: distributed index (DHT) + distributed search
- own crawler
- written in java -> poor performance
-> [homepage] -> [github] -> [my fork] -> [forum]
   -- the author is concentrating on another project, a cloud variant without p2p features -> lack of support
- p2p remote search unusable for me
- disable DHT in, enable DHT out as a broadcast & backup of crawled content
- needs a lot of RAM; I had to buy the maximum RAM the server could hold
- to speed it up, use an external solr; it needs some more RAM, otherwise it crashes regularly
  freebsd - edit: /usr/local/etc/ SOLR_HEAP="2048m"
--> [article, czech] Decentralizovaný vyhledávač YaCy: indexujte a vyhledávejte si po svém (Decentralised search engine YaCy: index and search your own way)
--> well... search - fulltext media search, available from time to time
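
The "distributed index (DHT)" idea from the notes above can be shown with a toy model. This is only a minimal sketch under stated assumptions: the class name ToyDHT, the peer names and the SHA-1 ring placement are all hypothetical and are not taken from YaCy's actual implementation. It shows only the general principle: each word is hashed to a position on a ring, and the peer responsible for that position stores the URLs containing the word.

```python
import hashlib
from bisect import bisect_right


def ring_position(key: str) -> int:
    """Map a string to a position on the hash ring (first 8 bytes of SHA-1)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:8], "big")


class ToyDHT:
    """In-memory stand-in for a word -> peer distributed index (illustrative only)."""

    def __init__(self, peer_names):
        # Peers sorted by their position on the ring.
        self.ring = sorted((ring_position(p), p) for p in peer_names)
        self.positions = [pos for pos, _ in self.ring]
        # What each peer stores: {word: set of URLs}.
        self.store = {p: {} for p in peer_names}

    def peer_for(self, word: str) -> str:
        """Return the first peer clockwise from the word's ring position."""
        idx = bisect_right(self.positions, ring_position(word)) % len(self.ring)
        return self.ring[idx][1]

    def index(self, url: str, text: str) -> None:
        """'DHT out': hand each word of a crawled page to its responsible peer."""
        for word in set(text.lower().split()):
            self.store[self.peer_for(word)].setdefault(word, set()).add(url)

    def search(self, word: str):
        """Ask only the peer responsible for the word."""
        word = word.lower()
        return sorted(self.store[self.peer_for(word)].get(word, set()))


dht = ToyDHT(["peer-a", "peer-b", "peer-c"])  # hypothetical peer names
dht.index("http://example.org/1", "decentralised search engine")
dht.index("http://example.org/2", "search the web")
print(dht.search("search"))
```

In this model a single-word query touches one peer, while a multi-word query has to contact several peers and intersect their answers. A real network adds latency and churn on top of that.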


----
- bad article date detection
- could be solution?
- or experiment with dates_in_content_dts?


Recoll
-------------
- perfect for a local library of e.g. books
- searching in books is sometimes even better than searching the web
- books usually don't contain spam ;-)
- optional web interface as a standalone python script or apache module
- able to OCR using tesseract
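
As an illustration of the OCR step mentioned above, here is a minimal sketch that calls the tesseract command line from Python. It assumes the tesseract binary and the requested language data are installed. The function names are hypothetical, and this is not how the indexer itself invokes tesseract. It only shows the plain CLI form `tesseract <image> stdout -l <lang>`.

```python
import subprocess
from pathlib import Path


def build_tesseract_cmd(image_path, lang="eng"):
    """Build the command line; 'stdout' as output base prints the text to stdout."""
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run tesseract on one image and return the recognised text."""
    result = subprocess.run(
        build_tesseract_cmd(image_path, lang),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output to str
        check=True,           # raise CalledProcessError if tesseract fails
    )
    return result.stdout


def ocr_directory(directory, pattern="*.png", lang="eng"):
    """OCR every matching image in a directory; returns {path: text}."""
    return {
        str(path): ocr_image(path, lang)
        for path in sorted(Path(directory).glob(pattern))
    }
```

For scanned Czech books the call would be, for example, `ocr_image("page-001.png", lang="ces")`, where the file name is made up and `ces` is tesseract's language code for Czech.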


Searx
-----
- metasearch
- mature & usable
- lots of plugins for various search engines (by their nature the plugins change often; if a particular one doesn't work, check github for the latest version)
- if you combine duckduckgo, startpage, bing & yahoo (and/or local ones such as seznam or yandex), you don't need google any more
- privacy: an interface between you and the search engines; on the other hand it sends all your requests to all of them
- balance between: public instance = huge load // private instance = less privacy
- able to bring together recoll and yacy in a single search page -> good interface to connect 'em all
- lots of dependencies, in fact it builds on browsers to pull queries out of the search engines
- lots of public instances
- relatively fast (depends on the number of search engines enabled)

EOF

Comments requested

~~~~~~~~~
Binary Sxizophreny - index of comp related stuff
Kangaroo's Homepage (czech)