📜 Website Archive
JMARyA 37cd37018f
fix fav
2024-12-30 22:14:39 +01:00
.woodpecker init 2024-12-29 16:51:34 +01:00
migrations add vector db 2024-12-30 14:06:32 +01:00
src fix fav 2024-12-30 22:14:39 +01:00
.dockerignore remove db 2024-12-29 19:35:56 +01:00
.gitignore remove db 2024-12-29 19:35:56 +01:00
Cargo.lock fix 2024-12-30 21:51:00 +01:00
Cargo.toml add vector search 2024-12-30 21:25:40 +01:00
docker-compose.yml add vector db 2024-12-30 14:06:32 +01:00
Dockerfile fix 2024-12-30 22:06:15 +01:00
env add vector db 2024-12-30 14:06:32 +01:00
README.md docs 2024-12-29 23:39:50 +01:00

WebArc

WebArc is a local website archive built on monolith.

Configuration

You can configure the application using environment variables:

  • $ROUTE_INTERNAL : Rewrite links to point back to the archive itself
  • $DOWNLOAD_ON_DEMAND : Download missing routes with monolith on demand
  • $BLACKLIST_DOMAINS : Blacklisted domains (comma-separated regex patterns, for example: google.com,.*.youtube.com)
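
As a rough illustration of how these variables might be read, here is a minimal sketch using only the Rust standard library. It is not taken from the WebArc source: the `Config` struct and the helpers `split_blacklist`, `flag_enabled`, and `read_config` are hypothetical names, and treating `1` or `true` as "enabled" for the two flags is an assumption, since the README does not state which values they accept.

```rust
use std::env;

/// Hypothetical container for the three settings listed above.
#[derive(Debug)]
struct Config {
    route_internal: bool,
    download_on_demand: bool,
    blacklist_domains: Vec<String>,
}

/// Split a comma-separated list of patterns, trimming whitespace
/// and dropping empty entries.
fn split_blacklist(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
        .collect()
}

/// Assumption: a flag counts as enabled when set to "1" or "true"
/// (case-insensitive). The real application may accept other values.
fn flag_enabled(value: Option<&str>) -> bool {
    matches!(
        value.map(|v| v.trim().to_ascii_lowercase()).as_deref(),
        Some("1") | Some("true")
    )
}

/// Read the settings from the process environment.
/// Unset variables fall back to "disabled" / "empty list".
fn read_config() -> Config {
    Config {
        route_internal: flag_enabled(env::var("ROUTE_INTERNAL").ok().as_deref()),
        download_on_demand: flag_enabled(env::var("DOWNLOAD_ON_DEMAND").ok().as_deref()),
        blacklist_domains: split_blacklist(&env::var("BLACKLIST_DOMAINS").unwrap_or_default()),
    }
}
```

With the example value from the list, `split_blacklist("google.com,.*.youtube.com")` yields the two patterns `google.com` and `.*.youtube.com`. Splitting on every comma means a pattern cannot itself contain a comma; whether the real parser has the same limitation is not stated in the README.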

Usage

Archived pages can be viewed at /s/<domain>/<path..>.
For example, /s/en.wikipedia.org/wiki/Website serves the archived copy of the page /wiki/Website from en.wikipedia.org.
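
The mapping from an original URL to its archive route can be sketched as follows. This is a standard-library-only illustration, not WebArc's own code: `archive_path` is a hypothetical helper, and it assumes that only `http://` and `https://` URLs are archived and that everything after the host is carried over unchanged.

```rust
/// Hypothetical helper: turn an original URL into the archive
/// route /s/<domain>/<path..> described above.
/// Returns None for URLs without an http(s) scheme or without a host.
fn archive_path(url: &str) -> Option<String> {
    // Strip the scheme; anything other than http/https is rejected.
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;

    // Split host and path at the first slash; a bare host maps to "/".
    let (domain, path) = match rest.find('/') {
        Some(i) => (&rest[..i], &rest[i..]),
        None => (rest, "/"),
    };

    if domain.is_empty() {
        return None;
    }
    Some(format!("/s/{domain}{path}"))
}
```

For instance, `archive_path("https://en.wikipedia.org/wiki/Website")` produces `/s/en.wikipedia.org/wiki/Website`, matching the example above.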

To select the archive from a specific date, add a ?time=YYYY-MM-DD query parameter to the URL.
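
A dated archive URL can be assembled with zero-padded formatting, as in the short sketch below. `archive_path_at` is a hypothetical helper, and it assumes the given route does not already carry a query string; it does not check that the date is a valid calendar date.

```rust
/// Hypothetical helper: append the ?time=YYYY-MM-DD parameter to an
/// archive route. Assumes `route` has no existing query string.
fn archive_path_at(route: &str, year: u16, month: u8, day: u8) -> String {
    // Zero-pad so the result always has the YYYY-MM-DD shape.
    format!("{route}?time={year:04}-{month:02}-{day:02}")
}
```

For example, `archive_path_at("/s/en.wikipedia.org/wiki/Website", 2024, 12, 30)` gives `/s/en.wikipedia.org/wiki/Website?time=2024-12-30`.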