Forum archive found #287
Replies: 4 comments
-
I did think about doing this a while ago. Downloading all the forum pages
would actually be relatively easy. Getting it into a state that would be
searchable, and so of use, is a bit tougher, and basically it fell down my
list of priorities.
Tim
On Sat, 7 Dec 2024 at 00:11, pakricard wrote:
Thanks to @dalers <https://github.com/dalers>, he found
https://web.archive.org/web/20201025015925/http://www.weberp.org/forum/showthread.php?tid=8169
to help with a discussion.
Some links are dead ends, but some are working.
I also found https://web.archive.org/web/*/http://www.weberp.org/forum/*
that contains more than 10,000 URLs related to the forum.
Could someone crawl these links and try to rebuild the forum somehow?
It's greatly missed.
-
I've thought about this too, perhaps using HTTrack to download web pages from the Internet Archive and a Unix shell script to extract text and images from the pages and create new topics in the GitHub project Discussions. However, the more I think through the details the less "relatively easy" it seems. I think I would also have to dig into phpBB to understand how posts, topics and replies are referenced using the API/URL in order to follow, extract and re-create threads.

My feeling is the same as @timschofield: there is higher priority work to do (a PHP 8.1 compatible v5 release with no significant issues), and while a re-created forum could provide reasons why something was done some way, the work to re-create the forum would be significant and most of the recovered content would not directly benefit a v5 release. However, I would certainly welcome someone taking the initiative to do the work, and I'd happily help if I can.
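
For anyone who does pick this up, here is a minimal sketch of the "how are threads referenced" step. It assumes the only pattern known from this discussion, `showthread.php?tid=<number>` inside a Wayback Machine URL. The `page` query parameter and the function names `thread_ref` and `group_by_thread` are assumptions for illustration, not something confirmed against the old forum software.

```python
import re
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit

# Wayback URLs look like https://web.archive.org/web/<14-digit timestamp>/<original URL>.
# An optional modifier such as "id_" may follow the timestamp.
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})[a-z_]*/(.+)$")


def thread_ref(archived_url):
    """Return (thread_id, page, timestamp) for an archived thread URL, else None."""
    match = WAYBACK_RE.match(archived_url)
    if match is None:
        return None
    timestamp, original = match.groups()
    parts = urlsplit(original)
    if not parts.path.endswith("showthread.php"):
        return None
    query = parse_qs(parts.query)
    tid = query.get("tid", [""])[0]
    if not tid.isdigit():
        return None
    # Assumption: multi-page threads carry a numeric "page" parameter.
    page = query.get("page", ["1"])[0]
    return int(tid), int(page) if page.isdigit() else 1, timestamp


def group_by_thread(archived_urls):
    """Map each thread id to its captures, sorted by page and then timestamp."""
    threads = defaultdict(list)
    for url in archived_urls:
        ref = thread_ref(url)
        if ref is not None:
            tid, page, timestamp = ref
            threads[tid].append((page, timestamp, url))
    return {tid: sorted(items) for tid, items in threads.items()}
```

Grouping captures this way only gives the order in which to read pages. Extracting the individual posts from each page would still need a look at the archived HTML.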
-
I just wrote a bash script to download the threads. It worked fine, but
then quite quickly their web server blocked me. I assume they block
attempts to use scripts to download en masse. It's possible that if you put
a random delay between attempts you might be able to fool the system, but I
don't know that it's worth the effort.
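
As an illustration of the random-delay idea, here is a minimal Python sketch (not the bash script mentioned above). The 5 to 10 second wait and the names `polite_delay`, `fetch_url` and `fetch_all` are assumptions, and there is no guarantee that any particular delay keeps the server from blocking the client.

```python
import random
import time
import urllib.request


def polite_delay(base=5.0, jitter=5.0):
    """Return a wait in seconds: a fixed base plus a random extra up to `jitter`."""
    return base + random.uniform(0.0, jitter)


def fetch_url(url, timeout=30):
    """Download one page and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch=fetch_url, sleep=time.sleep, base=5.0, jitter=5.0):
    """Fetch each URL in turn, pausing a random time between requests."""
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(polite_delay(base, jitter))
        try:
            pages[url] = fetch(url)
        except OSError as exc:  # urllib's URLError and HTTPError are OSError subclasses
            print(f"skipped {url}: {exc}")
    return pages
```

The `fetch` and `sleep` arguments are injectable so the loop can be tried out without touching the network.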
I was angry that the forum was deleted without taking a backup, but I think
we just have to move on. That's why I wanted to use the GitHub discussions,
so nobody can just delete the knowledge that the community freely gives.
Thanks
Tim
On Sun, 8 Dec 2024 at 20:35, Dale Scott wrote:
I've thought about this too, perhaps using HTTrack
<https://www.httrack.com/> to download web pages from the internet
archive and a unix shell script to extract text and images from the pages
and create new forum posts. However, the more I think through the details
the less "relatively easy" it seems. I think I would also have to dig into
phpBB to understand how posts, topics and replies are referenced using the
API/URL in order to follow, extract and re-create threads.
My feeling is the same as @timschofield <https://github.com/timschofield>,
there is higher priority work to do (a PHP8.1 compatible v5 release with no
significant issues) and while a re-created forum could provide reasons why
something was done some way, the work to re-create the forum would be
significant and most of the recovered content would not directly benefit a
v5 release. However, I would certainly welcome someone taking the
initiative to do the work, and I'd happily help if I can.
-
Makes sense. At least we can go there if we are lost and manually try to recover some info.
-
Thanks to @dalers, he found https://web.archive.org/web/20201025015925/http://www.weberp.org/forum/showthread.php?tid=8169 to help with a discussion.
Some links are dead ends, but some are working.
Examples:
I also found https://web.archive.org/web/*/http://www.weberp.org/forum/* that contains more than 10,000 URLs related to the forum.
Could someone crawl these links and try to rebuild the forum somehow? It's greatly missed.
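
A possible starting point for such a crawl is the Wayback Machine's CDX server, which can return the list of captured URLs as JSON instead of the web listing above. The sketch below only builds the query and turns a response into snapshot links; the field selection, the `statuscode:200` filter, and the names `cdx_query_url` and `snapshot_urls` are assumptions chosen for illustration.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(pattern="weberp.org/forum/*", limit=None):
    """Build a CDX query listing captures whose URL matches `pattern`."""
    params = {
        "url": pattern,              # trailing * asks for a prefix match
        "output": "json",            # rows as JSON arrays, first row is the header
        "fl": "timestamp,original",  # only the fields needed to rebuild links
        "filter": "statuscode:200",  # assumption: skip redirects and errors
        "collapse": "urlkey",        # one capture per distinct URL
    }
    if limit is not None:
        params["limit"] = str(limit)
    return CDX_ENDPOINT + "?" + urlencode(params)


def snapshot_urls(cdx_json_text):
    """Turn a CDX JSON response body into a list of Wayback snapshot URLs."""
    rows = json.loads(cdx_json_text)
    if not rows:
        return []
    header, *records = rows
    ts = header.index("timestamp")
    original = header.index("original")
    return [f"https://web.archive.org/web/{r[ts]}/{r[original]}" for r in records]
```

Fetching `cdx_query_url(limit=10)` in a browser first is an easy way to check what the archive actually holds before writing any crawler.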