YaCy is a peer-to-peer search engine. Each participant runs their own client, which can crawl and index websites. A search is carried out by contacting all known peers and merging their results. You do not need a web server for that. You can just as well install YaCy on your office computer, but of course it only works as long as that computer is connected to the internet.
A long time ago I maintained a YaCy peer on my web server. Later I lost interest because there were (and still are) too few peers online for it to be a reasonable alternative to Google, usually only a few hundred at any one time. But to flatter my vanity I have now decided to set up my own peer again, mainly to introduce several websites whose admin team I belong to. The main issue was that my web server uses Nginx as a reverse proxy and I do not want to expose additional ports to the internet (YaCy’s default ports are 8090 and 8443). Luckily, thanks to the Docker image, the install procedure proved fairly easy. Both Nginx and YaCy need only their default settings!
To use Nginx as a reverse proxy, its configuration needs a few proxy-specific directives. My default proxy_params file is longer than its counterpart in the Nginx GitHub repository:
proxy_set_header Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
client_max_body_size 100M;
client_body_buffer_size 1m;
proxy_intercept_errors on;
proxy_buffering on;
proxy_buffer_size 128k;
proxy_buffers 256 16k;
proxy_busy_buffers_size 256k;
proxy_temp_file_write_size 256k;
proxy_max_temp_file_size 0;
proxy_read_timeout 300;
This proved good enough. Installing YaCy from Docker requires only two commands (head over to this particular site to learn how to back up and update your instance):
docker pull yacy/yacy_search_server:latest
docker run -d --name yacy_search_server -p 8090:8090 -p 8443:8443 -v yacy_search_server_data:/opt/yacy_search_server/DATA --restart unless-stopped --log-opt max-size=200m --log-opt max-file=2 -e YACY_NETWORK_UNIT_AGENT=mypeername yacy/yacy_search_server:latest
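One caveat about the command above: when `-p` is given without a host address, Docker publishes the port on all interfaces of the host. If Nginx runs on the same machine, binding the ports to the loopback address should be enough. The following is only a sketch under that assumption; `BIND_ADDR` and `start_yacy` are names made up for this example, everything else is taken from the command above.

```shell
#!/bin/sh
# Sketch: start YaCy with its ports published on loopback only.
# Assumption: Nginx runs on the same host as the container.
# BIND_ADDR and start_yacy are hypothetical names for this example.
BIND_ADDR="${BIND_ADDR:-127.0.0.1}"

start_yacy() {
    # "-p ADDR:HOSTPORT:CONTAINERPORT" limits the published port to ADDR.
    docker run -d --name yacy_search_server \
        -p "${BIND_ADDR}:8090:8090" \
        -p "${BIND_ADDR}:8443:8443" \
        -v yacy_search_server_data:/opt/yacy_search_server/DATA \
        --restart unless-stopped \
        --log-opt max-size=200m --log-opt max-file=2 \
        -e YACY_NETWORK_UNIT_AGENT=mypeername \
        yacy/yacy_search_server:latest
}
```

Call `start_yacy` instead of the plain `docker run` line. If a container with that name already exists, it has to be removed first; the named volume keeps the data.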
We do not need any TLS settings in YaCy since TLS is handled by Nginx (using Let’s Encrypt in this case). Since YaCy’s internal links are all relative, we can proxy to localhost without caring about host name and protocol scheme. The following Nginx server block is fully operational:
server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name my.host.name;
    root /var/www/my.host.name;
    index index.html index.htm default.html default.htm;
    location / {
        proxy_pass http://127.0.0.1:8090;
        include /etc/nginx/proxy_params;
    }
    access_log /var/log/nginx/my.host.name_access.log;
    error_log /var/log/nginx/my.host.name_error.log;
    ssl_certificate /etc/letsencrypt/live/my.host.name/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/my.host.name/privkey.pem; # managed by Certbot
}
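To see whether both hops work (YaCy on localhost and Nginx in front of it), a small check along these lines can help. It is a sketch, not a definitive test: `http_status` and `check_yacy` are helper names invented here, curl has to be installed, and I assume that YaCy's start page answers with status 200.

```shell
#!/bin/sh
# Sketch: verify that YaCy answers locally and through the Nginx proxy.
# http_status and check_yacy are hypothetical helper names.

# Print only the HTTP status code of a URL (curl prints 000 if no response arrives).
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Check the local port and the public host name given as $1.
# Returns non-zero if either URL does not answer with 200
# (assumption: the start page answers with 200).
check_yacy() {
    rc=0
    for url in "http://127.0.0.1:8090/" "https://$1/"; do
        code="$(http_status "$url")"
        echo "$url -> $code"
        [ "$code" = "200" ] || rc=1
    done
    return "$rc"
}
```

After `nginx -t` and a reload of Nginx, `check_yacy my.host.name` prints one line per URL. A 502 on the second line usually means Nginx is running but cannot reach the container.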
Head over to my search interface. Note, however, that it uses an extended blacklist excluding pseudoscience, extremist politics, conspiracy theories and so on (mainly German sites). Use another YaCy instance to run the same search without my exclusions.