Author Topic: Starting an archive of Apollo documents  (Read 5367 times)

Offline ka9q

  • Neptune
  • ****
  • Posts: 3014
Starting an archive of Apollo documents
« on: January 10, 2014, 08:46:21 AM »
I know I kicked this around a while ago but I can't seem to find the old thread.

Anyway, I'm increasingly frustrated in trying to find Apollo documents on NASA's NTRS. Searches are hard to conduct, documents disappear, links change even for documents that don't go away, and of course the entire site goes down when the government does. It's hard to refer people to them without putting them up on my own site.

I have amassed a pretty large collection, mostly from the NTRS but also from elsewhere, and would be interested in merging it with anyone else's to make a comprehensive archive. I am willing to contribute the web server for the project but as I am a communications and low-level networking guy, not a web programmer, I could use a volunteer to help set up the web interface and submission mechanism.

The first step would be to just collect everything everybody has and cull out the duplicates. (I have a fast tool for this, at least when the documents are bit-for-bit identical.) Then cull out the non-bit-identical duplicates, discarding inferior or incomplete versions (or moving them aside so we can more quickly recognize them if resubmitted). Many of the documents have duplicate or out-of-sequence pages that need to be fixed. And of course there's the big job of sorting and indexing everything by mission, system, topic, etc. We will want revision control, starting with the originally submitted copy.
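A minimal sketch of the bit-for-bit duplicate pass, assuming Python 3. This is not the tool mentioned above, just an illustration of the idea; the function names `sha1_of_file` and `find_exact_duplicates` are made up for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group files under root by SHA-1; keep only groups with 2+ files."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[sha1_of_file(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

Each group returned holds files with identical contents, so all but one can be set aside. Near-duplicates (rescans, re-saved PDFs) hash differently and would still need the manual second pass.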

People could contribute as much effort to this as they want, and it can continuously evolve just like the ALSJ and AFJ; I'd be grateful just to have everyone else's archives so I can see how much I already have. I will try to find or build a tool that will allow people to determine if the document they have is already in the collection so it doesn't have to be uploaded again. Basically you'll compute a hash of the file locally and look it up in the online collection of hashes of existing files. If it matches, it's a dupe.
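The lookup could be sketched roughly like this in Python 3. It assumes the collection's hashes are published as a plain text listing with one hex digest at the start of each line (the layout `sha1sum` prints); that listing format and the function names are assumptions for illustration, not an existing service.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_known_hashes(listing_path):
    """Read a listing whose lines begin with a hex digest; return a set."""
    known = set()
    with open(listing_path, "r", encoding="utf-8") as listing:
        for line in listing:
            fields = line.split()
            if fields:
                known.add(fields[0].lower())
    return known


def is_already_archived(path, known):
    """True if the local file's SHA-1 appears in the known set."""
    return sha1_of_file(path) in known
```

Only the 40-character digests travel over the network, so a contributor can check a whole folder without uploading a single document.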
« Last Edit: January 10, 2014, 08:48:16 AM by ka9q »

Offline Noldi400

  • Jupiter
  • ***
  • Posts: 627
Re: Starting an archive of Apollo documents
« Reply #1 on: January 11, 2014, 02:08:16 PM »
I'll be happy to throw in all the Apollo documents I've collected - I'll have to cull through them a bit, as they're mixed in with some related documents I've DL'd from time to time for specific rebuttals.  Where do we send them and when would you like to start?  Would a zip file be better than a bunch of individual files? You might be able to control your data flow a little better that way.
"The sane understand that human beings are incapable of sustaining conspiracies on a grand scale, because some of our most defining qualities as a species are... a tendency to panic, and an inability to keep our mouths shut." - Dean Koontz

Offline Obviousman

  • Jupiter
  • ***
  • Posts: 737
Re: Starting an archive of Apollo documents
« Reply #2 on: January 11, 2014, 07:53:22 PM »
Ditto; I've downloaded heaps from the NTRS as well as other sources; very happy to contribute.

Offline ka9q

  • Neptune
  • ****
  • Posts: 3014
Re: Starting an archive of Apollo documents
« Reply #3 on: January 12, 2014, 01:35:28 AM »
Can either of you run the "sha1sum" command on each of your files?

That command is standard in Linux and many other UNIX-like systems, including Macs, I think. It computes a hash fingerprint of each file using the SHA1 standard and prints the 160-bit result in hex. I can compare them to my hashes to see if I already have your file without your actually having to send it.

Try to do this on the original versions of the files exactly as you pulled them down from the net. Changing even one bit of a file scrambles its hash into something completely different.
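A small illustration of that sensitivity, assuming Python 3 and its standard `hashlib` module. The sample byte string is made up; it stands in for a file's contents.

```python
import hashlib

# Stand-in for a file's contents (hypothetical sample data).
original = b"Apollo 11 Mission Report"

# Copy the data and flip the lowest bit of the first byte.
altered = bytearray(original)
altered[0] ^= 0x01

hash_a = hashlib.sha1(original).hexdigest()
hash_b = hashlib.sha1(bytes(altered)).hexdigest()

# Count how many of the 160 output bits differ between the two digests.
differing_bits = bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")

print(hash_a)
print(hash_b)
print(f"{differing_bits} of 160 bits differ")
```

For a hash like SHA1 one would expect roughly half the output bits to change, which is why a file that has been re-saved or edited in any way will no longer match.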


Offline Obviousman

  • Jupiter
  • ***
  • Posts: 737
Re: Starting an archive of Apollo documents
« Reply #4 on: January 12, 2014, 03:08:06 AM »
Sorry - Windows 7 here.

Offline Echnaton

  • Saturn
  • ****
  • Posts: 1490
Re: Starting an archive of Apollo documents
« Reply #5 on: January 12, 2014, 08:41:42 AM »
While not a trivial exercise, you can run Linux on your Windows computer for free by using VirtualBox and one of a number of Linux setups made just for virtual machines.
The sun shone, having no alternative, on the nothing new. —Samuel Beckett

Offline ka9q

  • Neptune
  • ****
  • Posts: 3014
Re: Starting an archive of Apollo documents
« Reply #6 on: January 12, 2014, 09:40:00 AM »
I don't run or develop for Windows. Maybe we can find an existing tool to compute SHA1 hashes; it's a pretty common function.

Here we go: http://lists.gnupg.org/pipermail/gnupg-announce/2004q4/000184.html
« Last Edit: January 12, 2014, 09:42:06 AM by ka9q »

Offline Trebor

  • Earth
  • ***
  • Posts: 214
Re: Starting an archive of Apollo documents
« Reply #7 on: January 12, 2014, 11:40:40 AM »
I have a handful of Apollo documents which might go in.
I'll PM you the sha1's.

Minimal web programming experience though.

Offline grmcdorman

  • Earth
  • ***
  • Posts: 149
Re: Starting an archive of Apollo documents
« Reply #8 on: January 12, 2014, 12:09:00 PM »
There are lots of apps for Windows with a graphical UI instead of the command line, including "portable" versions (no-installer, designed to run from USB drive). This one: http://www.ov2.eu/programs/rapidcrc-unicode includes SHA1 support, among a lot of others.

Edit to add: Portable version of Rapid CRC Unicode is here: http://portableapps.com/apps/utilities/rapid-crc-unicode-portable

Offline Noldi400

  • Jupiter
  • ***
  • Posts: 627
Re: Starting an archive of Apollo documents
« Reply #9 on: January 12, 2014, 02:28:35 PM »
I'm certainly willing to try - I need to look into the sha1sum thing a bit, though.  I'll get back to you soonest.
"The sane understand that human beings are incapable of sustaining conspiracies on a grand scale, because some of our most defining qualities as a species are... a tendency to panic, and an inability to keep our mouths shut." - Dean Koontz