Suck it down - archiving all old posts

If anyone is interested, here’s progress on making an archive of all the old posts.

I’ve reposted this as a new topic because, if you’re like me, the sidebar defaults to ‘New’ rather than ‘Latest’. When I go to Mechanical Investing it doesn’t even show me my post updating progress (and asking for help) on this issue, because it’s not ‘new’; I have to hit ‘Latest’ to see it. BTW, if a non-expert like me can get this far, surely TMF, with paid professionals, can easily build an archive so we don’t lose all the valuable information in the old posts.

OT:
How do I make the sidebar default to ‘Latest’? I think I asked this before, but (a) I don’t think it was answered and (b) I couldn’t find the answer if it was.

3 Likes

Jeez – hit the link. I don’t know why it highlighted part of it (the beginning, I guess); ignore that and hit the link to see progress on this issue.

1 Like

tedthedog,

Have you seen this post by MarkW - author of the unofficial search engine for the old boards:

I think he has already done what you want someone to do.

1 Like

It sounds like you have things working. Back in 2003 I was asked by a poster who was going to write a book to archive the Retire Early board. I used webl, on a Mac at the command line. Info had been on the MI board, but the link in this post no longer works, and my first Google search on part of the URL brought me to this post: https://discussion.fool.com/dneuman-nice-to-meet-you-at-micon-you-and-i-14677511.aspx?sort=postdate

It used Java, with an archive tool called foolpull that someone else created (see above).

And looking in one of my files, it looks like you would have commands like these:

<start at message 1 of the PIXR board, getting 200 threads in 50 thread chunks>
java -jar /webl/webl.jar /foolpull/mfarch.webl -c 106980 1 200 50

I don’t remember the specifics, and don’t know if the tool still works on the current setup. It’s been 19 years! But it looks like the resulting output is pretty condensed.
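For anyone trying to picture what the last three numbers in that command do, here is a minimal Python sketch. It assumes they mean start message, total threads, and chunk size, as the note above the command suggests. This is not foolpull's actual logic, and `chunk_ranges` is a made-up helper name for illustration only.

```python
def chunk_ranges(start, total, chunk_size):
    """Yield (first_thread, count) pairs covering `total` threads from `start`.

    Assumption: the archiver fetches threads in fixed-size batches, with a
    possibly smaller final batch.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    first = start
    remaining = total
    while remaining > 0:
        count = min(chunk_size, remaining)
        yield first, count
        first += count
        remaining -= count


if __name__ == "__main__":
    # Mirrors the "1 200 50" arguments in the command above.
    for first, count in chunk_ranges(1, 200, 50):
        print(f"fetch threads {first} to {first + count - 1}")
```

Pulling in batches like this means a failed run only costs you one chunk, not the whole board.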

It looks like I wrote some “readme” files at the time to explain how to set things up and archive. The instructions were for this other poster, and he ended up just having me archive things and send him the files.

I have other things going on these days, but I might play around with it just for kicks. If other things don’t work for you, let me know and I’ll try to prioritize it just a bit more.

I admin/moderate another investment site these days, and we had a similar thing about 10 years ago, forced to move to a new platform. Truthfully it’s like many things in life, where you don’t want to lose the knowledge/data/stuff/whatever from the past, but in reality you don’t really refer to it much, especially in investing where so much is forward-based. Sometimes a place like archive.org already backs things up for you, though the interface might not be ideal and it depends on the settings TMF allows. But if only using it occasionally, it might be a decent solution.

3 Likes

While the current link fails, using that same archive.org site pulls up the page. This link is to a copy made in 2005. And it includes TMF’s policy at the time, of only archiving for personal use, which apparently was why I wrote out the process for the other user, instead of just archiving it for him. Such a rule follower.

OTOH, web.archive.org does have some of TMF backed up, but it might just be the list of posts and not the posts in full. So if you click on a post on this listing, it fails. I don’t know if there is a workaround, since my initial basic attempts failed, or if it is the expected behavior for TMF’s archive through them. https://web.archive.org/web/20210708045025/https://discussion.fool.com/mechanical-investing-100093.aspx
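If you want to check individual post URLs without clicking through by hand, the Wayback Machine has a public availability endpoint. Below is a rough Python sketch, assuming the JSON layout that endpoint normally returns (`archived_snapshots` → `closest`). The function names are my own, and I have not tested it against TMF pages specifically.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL; timestamp is an optional YYYYMMDD-style string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived URL from a response body, or None if nothing is archived."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url, timestamp=None):
    """Ask the Wayback Machine for the snapshot closest to `timestamp` (needs network)."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Calling `lookup()` on a single post URL would return either a snapshot link or `None`, which at least tells you whether a full post was ever captured.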

Good luck in your attempts.

2 Likes

Thanks StevnFool. I replied to Mark in that thread but I bet he’s no longer reading.
How would I contact him?

Try: willcox@datahelper.com

I contacted him once when MI DataHelper stopped archiving posts, and he was very helpful. Be sure to include your phone number, and good luck.

Charlie

1 Like

4aapl: I don’t see web.archive.org as a solution.

I think I’ve got the data pulled down; at least when I randomly poke at it (see my screenshot above), it looks like the data is successfully there. But hitting ‘next’ and ‘previous’ in the posts obviously isn’t a practical way to navigate, so I was hoping that somehow we could connect Mark’s search engine to the archive.
Or, if he’s already done this, I’d love to know how to access it. There is so much valuable discussion in the old posts, it’s a shame to see it wasted. The simplest solution of course is for TMF to make such a searchable archive of old posts available. Believe me, if I can do it (scrape the data), then it ain’t hard.
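To show that even a bare-bones keyword search over a local copy is not much code, here is a minimal Python sketch of an inverted index. It assumes the scraped posts are already loaded into a dict of post ID to text, which is a hypothetical layout and not how my pull or Mark's search engine actually stores things.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(posts):
    """Map each word to the set of post IDs that contain it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in tokenize(text):
            index[word].add(post_id)
    return index


def search(index, query):
    """Return IDs of posts containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


if __name__ == "__main__":
    # Made-up sample posts, for illustration only.
    posts = {
        1: "Backtest of the screen over ten years",
        2: "New screen idea, no backtest yet",
        3: "Off topic chatter",
    }
    index = build_index(posts)
    print(sorted(search(index, "backtest screen")))
```

A real search engine would add ranking, dates, and author filters, but this is enough to replace clicking ‘next’ and ‘previous’.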

3 Likes

WaveDoc: will do, thanks!