Tuesday, December 31, 2013

Geo 365: Reflections on The Index and Geo 365 Generally

I didn't even think about it at the beginning. It never occurred to me. Had I known the time, focus, and mental strain it would entail, I never would have started on it. But once I did, it was over a month before I started having second thoughts. By that time, I saw what it could be, and I couldn't drop it, as much as I wanted to.

Yes, it really was all that.

I'm a little terrified to start this post, too, for a similar reason: there's just so much to say, to connect, and to explain. I'm quite certain I'll forget to mention important points that I've been meaning to talk about.

Note: The first part of this post is going to come across as quite negative, but it's important to read and comprehend if "Geo 365: The Index" has inspired you to try something similar. DO NOT approach this task lightly; it's obsessive-making, and a real mind-fuck. If you want to skip the process and "The Ugly," scroll down to the header "The Good," down there somewhere. On the other hand, "The Ugly" makes for an amusing read, as long as you take my comments as acerbic (as intended), rather than whining or complaining. On the other other hand, DO take it quite seriously if you're interested in doing something similar. I hope you might be, but I want to be quite clear about what it entails, and I don't want to be responsible for anyone's institutionalization. Below "The Good" you will find "Emulation Tips," "User Advice," and "Looking Back on a Year of Geo 365."

The Ugly

I Tweeted last Thursday, "REALLY stoked to get this... monstrosity... finished and published next Tuesday. Key numbers: >200, 206, 359, 5288" (as of that Tweet). Intentionally obscure so I didn't spoil the surprise, those numbers stand for the following: in excess of 200 hours working on the damned thing, 206 pages of html, 359 key words, and 5288 links. That captures the magnitude of the task pretty well, but it does not capture the mental strain and drain of actually doing it.

Let's back up a bit here. Two and a half years ago, I made this statement:
One of the things I dearly wish could happen- probably so impractical as to be impossible- is an index of geology blog posts by topic. There must be many tens of thousands of those already, so just filing those already existing would be a herculean task, and as time goes by, more and more geoblogs appear on the radar.
Yeah, drop that "probably." It's impractical plain and simple. Much as I would love to see a single index for the whole geoblogosphere, it ain't happening. It would be impossible for me to go back through just my own geology posts and index them all- my mind reels at the idea. However, the idea of a Geoblogospheric Index has been haunting me for years, even before that particular post.

In mid August, it occurred to me I'd like to have something that would tie the whole pile of Geo 365 photos into a single entity, that would allow people to see it and use it as an active integrated series, rather than something floating in the distant past that was 2013. Yes, Google would bring in some visitors, but they would only see that particular post, not realizing something better related to their interests was only a day/post or two away, or not realizing that it was a particular moment on a particular trip that has been outlined rather carefully in a sequence of posts. The initial result, a clear list of what each photo is (as opposed to my often punny or opaque post titles), was brought up to date on August 31: The Table of Contents. As I looked at it, I knew that it wasn't what I'd been looking for. It wasn't searchable without a lot of patience. It didn't really tie groups and sequences of photos across days, weeks, and even months together. It just didn't (and still doesn't, to my mind) look "useable," whatever I think that means.

It didn't take long before my coyote trickster of a mind coughed up INDEX! Ah-ha! That would be useable and useful! It was exactly the tool I'd been reaching for, two weeks earlier. So on September first, I started the long slog.

The first pass was fairly straightforward. I wrote in keywords as they occurred to me, and starting with January first's post, made a pass through the photo and text. When I came to the first relevant keyword, I would type in "date-comma-space," then link the post URL to the "date" part, and copy that short little sequence into the buffer. Then for following relevant keywords, I'd just paste the same "date/link-comma-space" at the end of the list. Meanwhile, I'd update the TOC from time to time, but I rapidly lost interest in it. It took very little time to keep up with it, though, so I stuck with it until the end.
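For anyone who would rather script the bookkeeping than paste by hand, here is a minimal sketch of that same "date/link-comma-space" routine. To be clear, this is not how I built The Index (that was all typing and [control-v]); the function names, date labels, and example.com URLs are made up for illustration.

```python
from collections import defaultdict


def add_post(index, date_label, url, keywords):
    """Append one "date/link" entry to every keyword judged relevant to a post."""
    for keyword in keywords:
        index[keyword].append((date_label, url))


def render_entry(keyword, entries):
    """Render one index line: the keyword, then comma-separated linked dates."""
    links = ", ".join(f'<a href="{url}">{label}</a>' for label, url in entries)
    return f"{keyword}: {links}"


# Hypothetical posts and URLs, for illustration only.
index = defaultdict(list)
add_post(index, "Jan. 1", "https://example.com/geo-365-jan-1", ["Basalt", "Waves"])
add_post(index, "Jan. 2", "https://example.com/geo-365-jan-2", ["Waves"])

# Print the index in alphabetical order by keyword.
for keyword in sorted(index):
    print(render_entry(keyword, index[keyword]))
```

Note that the script only handles the pasting. Deciding which keywords belong to which post is still entirely up to you.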

But the devil lay in two little details: "relevant keywords," and "I wrote in keywords as they occurred to me." The first became important in the sense of "type one and type two statistical errors." Type one, in this sense, would be associating the photo and/or text with a keyword that's not really relevant. Type two would be failing to associate the post with a keyword that is relevant. You (like me) probably just assumed that associating (or dissociating, for that matter) keywords with a photo and text is straightforward. I'm here to tell you that many are. But many, too many, are not. I had to create mental algorithms to help cut through some of that confusion. One such was that if [keyword] was clearly apparent in the full size version of the photo (NOT as displayed at 500 pixels width in the blog, but the "right-click, open link in new tab, then click the photo for full size view"), it got linked. This necessitated a fair amount of squinty peering at photos to see if I could spot this feature or that. Another was, if I mentioned it in the text, it got linked, whether it was relevant to the overall gist of the post or not. Yet another was, even if it wasn't in the photo or text, if it was intrinsically related, if it was a thing or process someone would need to recognize and understand to interpret the situation geologically, it got linked. So in using the index, you might click through on any of "Pacific Ocean," "Waves," "Water," or "Wave Cut Terrace," and think, "Wait, I'm not seeing any of those except the wave cut terrace." Correct, but that one implies the other three. So if you go to a post for any particular feature or process, and don't see it, try to think it through. At least at some moment in time I thought it was probably relevant. Another was location: if the photo was taken in Newberry Caldera, or at the Pacific Ocean, those locations were linked, even if it's not visible in the photo and I didn't mention it.
The final algorithm I'll discuss is that for some keywords, Biology-Geology Interactions, Earth-Human Interactions, and Tectonics come to mind, those things are so ubiquitous, I could in all likelihood justify linking every single post to them. At the very least, I was there photographing it. I think that counts as an Earth/Human Interaction, right? But obviously, such an association would not be useful, which is the whole point. So for some terms, I had to decide if each post was a particularly good, or at least decent, example. There were others, but the above mental go-to decision-making schemes illustrate that the process is anything but straight-forward and quick.

""I wrote in key words as they occurred to me," was actually the killer though, and I don't think it could possibly be avoided. There simply isn't, and can't be, a "ready-made list of terms relevant to stuff you've been blogging about." Some of them are terms that, as far as I know, no one else uses. "East-Side Apron," is one that comes to mind, referring to the gentle slope of tephra as one descends to the east from the Cascades. "The Wall" is a local colloquial name. Such terms are bound by quotation marks in the index, to let users know that they're not (again, as far as I know) in broad usage, or even acceptable. The problem arises when you add a word on June 30th that is likely relevant to a significant number of the previous 180 posts. I finished the first pass September 27, then dutifully started over again at January 1st. By the end of January's posts I was ready to die. Keep in mind, this was actually the THIRD time through the litany of posts, first for the TOC, second for the "first pass" building the Index. (Fourth or more if you consider composing, proofing, editing, and occasional re-reads.) The number of keywords was well over 300 by that point- I'm thinking I've added maybe 30 keywords since I started the second pass, mostly terms referring to things I was confident I hadn't discussed/shown before 9/27, and location names. I've been making a list of terms I wish I had included from the very beginning, some of which are painfully obvious:
  • Differential Weathering/Erosion
  • For Scale
  • Fossils
  • Mass Movement/Wasting
  • Mud/Siltstone
  • Secondary Mineralization
  • Sedimentary Environments (I do have some key words that are specific, but not a general catch-all)
  • Weathering
  • Western Cascades
And I *may* yet do a third pass with those alone, but I'm making no promises.

Now imagine: "indexing" each post entails looking over the full-size photo carefully, re-reading, yet still another time, the text, then reading carefully through a list of some 350 terms, and deciding whether each term is or isn't relevant. That list of terms is in no way structured to be engaging or coherent in any sense related to thought, just a list of (mostly) geology terms, and locations, in alphabetical order. It isn't meant to convey meaning, simply provide others an easy means to find posts related to particular things and concepts. There will, in the end, be 365 such posts, and for each, I will have to read that list twice, and attempt to be mindful of the meaning of each one of those terms. Let me tell you... this is not as easy as it sounds, even if you think it sounds ridiculous to even try. Let's see, 359*365*2= 262,070. So my mind had to struggle with reading over a quarter million words that had no intention of being coherent, while trying to pay mindful attention to each one, and making a decision about the disposition of each one with respect to a blog post, which incidentally, I was also attempting to keep in my mind.
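If you'd like to check that arithmetic, or plug in your own numbers before committing to a project like this, the sum is easy to reproduce. The variable names here are just labels I've picked for the three figures above.

```python
keywords = 359  # terms in the index at the time of writing
posts = 365     # one post per day of the year
passes = 2      # each post gets checked against the full list twice

term_checks = keywords * posts * passes
print(f"{term_checks:,}")  # 262,070
```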

Needless to say, my mind rebelled frequently. Generally speaking, *I* am pretty good at minuscule, detail-oriented tasks. It's a characteristic that has served me well in lab jobs, curriculum analysis and development, and lots of other situations. But my *brain* sometimes boggles; this was, and continues to be, one of those times. My brain. It just... goes away. I'll find myself merrily skimming from the S's into the T's, and realize I haven't registered a single word in the list since "Lignite." Or staring at a word, dumbfounded, with no idea what the fuck it means- "Kipuka" was the one most talented at stopping me in my tracks, despite the fact I've known its meaning for three decades or more. But I imagine there's not a single word on that list that hasn't stopped me short at least once, as my brain struggled to find the slightest bit of meaning for each word-after-word-after-word-after-word, a quarter million times.

I *still* haven't completed the second pass, up through September 27. I'm up to August 8. You know what? I don't care. I know there are omissions, type 2 errors, or as I've been referring to them, holes- where something should be, but isn't. And I doubt anyone will notice, but I'll continue to fill those holes, slowly, as my brain permits. The fact is, I'm finding fewer and fewer as I proceed through, just as I expected. What that means is that my brain has to mindfully pay attention to more and more words fruitlessly before it gets that little jolt of positive reinforcement with that "Aha!" and hitting [control-v] one more time. That's not facetious: it is indeed a pleasure to find each connection, and fill each hole. Indeed, that's a big part of what has kept me plugging at it, and will keep me plugging at it, long after today's four Geo 365 posts go live.

The Good

First and foremost, I'm pretty damned proud of this. I've never seen anything like it. Yes, there are still holes. Yes, there are obvious keywords that should have clearly been in the list since the outset, and there are some that are overbroad, and perhaps unnecessary.

Yes, it has some flaws, but they are effectively overwhelmed by the awesome.

Second, this accomplishes exactly what I had hoped, I think. It makes the Geo 365 series an actively useful set, long after any particular day and post is buried in the past. It unifies different posts by location, features, rock types, and other aspects. It makes it easy for other geobloggers and geologists, Oregon visitors and residents, students, and interested others to find organized information on the geologic foundations of this wonderful place.

Third, it served as a constant reminder of how much I've enjoyed just doing this series. "Oh, yeah, I remember that stop... that was a good one!" It's not just the photos, it's reliving those moments with rocks in the wild.

Fourth, it's already been useful to me in tracking down posts and links that are relevant to a conversation, or when I'm working on a current post. By no means has it saved me more than a tiny fraction of the time invested, but it does mean I've included links in posts or conversations, where otherwise, I wouldn't have bothered.

Fifth, with reservations (see below), I'd really like to see others try this out.

Finally, was it worth it? Yes. Every. Single. Minute. My sense of satisfaction with myself at this moment, 8:54 AM PST, as I prepare to put this all to bed, is higher than it has been in many years.

Emulation Tips
  1. Don't start this as a lark. A lot of it is not fun, and not a lark at all. You won't finish it.
  2. Don't start with a series that's well underway. As I said earlier, if I'd had any clue what I was signing up for Sept. 1, I wouldn't have even considered doing it. If I had started January 1, I think it would have been much, much easier... the current posts only take 15-20 minutes to Index. It's the second pass that's the killer.
  3. Start with a list of basic geology terms, so you don't make the mistake of leaving out weathering or fossils until it's too late. Try to be prospective, and consider what sorts of words you're likely to want before you start. Get that list fleshed out as well as possible before you begin. Any terms not used by the end can be snipped off.
  4. #3 is probably not going to help, and you will still need to do a second pass. It will hurt like hell. The fact is, you will have some holes on your list, just as I did, simply because you miss keywords from time to time, even when they're in the list for the first pass.
  5. Take frequent breaks. You, too, will be asking your brain to store one batch of information, fetch other batches of information, and decide whether there's a connection between them, rapidly and frequently, over and over and over. Your brain will get tired. And if you ask it to keep going, it will get stuck in that mode to the point where you, too, will be staring dazed and confused at pretty much anything, as your brain tries to figure out what information it's supposed to be storing, what information it's supposed to be fetching, and whether your current situation is relevant to that. I'm totally serious. I was trying to check out and pay at a local convenience store, and my brain kept asking "Wait, what day are we even on!? What WORD are we on!?" That was the worst case, but there were many other instances in which my brain tried to revert to "indexing mode" in the midst of some other task. I was honestly, truly not exaggerating when I said "it's obsessive-making, and a real mind-fuck."
  6. Accept that it's going to take a lot of time, because it will. Decide fairly quickly- within a month or less- whether it's really worth it. Because once it starts to mature, and you start to see what a magnificent beast it's becoming, you can't resist it anymore. By the time I got to the middle of the second pass, it was becoming a serious threat to what mental health I have, but all I could see was the grandeur of what it would be in finished form. At that point, I was an addict, and there was no letting it go.
  7. The second pass is a killer. It hurts your brain, and your brain does not appreciate it.
  8. The up-side: I don't know the reaction The Index will receive once it goes live, because I'm writing this on 12/29. But four of the five geobloggers who've seen previews have been very, very positive, and the fifth has kinda dropped off the face of the internet. And even non-geology friends here in Corvallis have been impressed. All that said, I'm less interested in impressing people than in creating a resource that's USEFUL. And I strongly suspect this is. People will learn. A lot. And that's what drives me. If you do get a decent index for some series or blog, people will learn. A lot. If you like to see information organized and readily available, and find reward in teaching, you may find that a powerful motivator. This may very well be the most powerful and (semi-?) permanent teaching tool I have ever created, and I've created a fair number.

User Advice

Okay, everyone reading this presumably knows how to use an index. But a short bit of advice: some terms have many, many links, some a fair number, and some only one or a few. The more specific you can make your search term(s), the more quickly and easily you'll find a specific target. On the other hand, if you just want to rummage through random pictures of Volcanoes/Volcanic rocks, the Pacific Ocean at the Oregon shore, or scenes that are related to water in one way or another, have at it.

Also, the abbreviations I've used are pretty obvious to me, but for explicitness' sake:
  • CP: County or City Park
  • CA: California
  • NM: National Monument
  • NP: National Park
  • NRA: National Recreation Area
  • OR: Oregon
  • SP: State Park
  • WA: Washington
A further note: a question mark immediately before the "date/link" means one of several things. It may mean I think it's a questionable association, that I'm uncertain, but it's my best guess, or that I think it's probably this or that, but I'm not sure which. In this latter case, it'll likely be listed under keyword "that" as well, also with a question mark.

Looking Back on a Year of Geo 365

I started out very uncertain I even wanted to start out, and, really, fairly certain I would bag it. I was wrong. This post, from only three weeks in, still pretty much nails my attitude toward it. The only thing I would add is that since then, I've been mostly sticking with chronological order through a day or two of various trips, which, combined with the location information, should allow others who would like to visit these spots to do so. The nice thing about FlashEarth is that the URL embeds the lat-lon information, so even if that service goes off line, and the links die, the location is easily extractable from the link, and can be plugged into some other service or device.
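As a rough sketch of what "easily extractable" could look like in practice: the snippet below assumes the coordinates sit in the link as `lat` and `lon` query parameters. That parameter layout is my assumption about the URL format, and the example link and its coordinates are invented, so check it against a real link before relying on it.

```python
from urllib.parse import urlparse, parse_qs


def extract_lat_lon(url):
    """Pull latitude and longitude out of a link's query string.

    Assumes the link carries `lat` and `lon` query parameters; that
    layout is an assumption, not something confirmed in this post.
    """
    query = parse_qs(urlparse(url).query)
    return float(query["lat"][0]), float(query["lon"][0])


# Hypothetical link with made-up coordinates, for illustration only.
example = "http://www.flashearth.com/?lat=44.5&lon=-123.25&z=12"
print(extract_lat_lon(example))  # (44.5, -123.25)
```

The resulting pair can then be typed into whatever mapping service or GPS device you prefer.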

I think my main feeling at this point is that I'm sorry it's over. There's so much more to show and tell...
