This has lasted long enough. I am tired of this problem, and unfortunately I have discovered that it is just a subset of an even broader, more general problem. Let me explain. I browse the internet with hundreds of tabs and tens of thousands of bookmarks. Back when I was using Firefox (2003-2006) I had little imagination and taste for enhanced browsing experiences, so I never noticed that Firefox leaked memory and handled simultaneous page loads poorly, even though I was doing lots of AJAX-like development (XUL, JS, DHTML, what have you). But when I saw an Opera power user, I was hooked on the new method: open as many tabs as you want, browse to wherever you want to go, bookmark bookmark bookmark (kind of like the programmer's mantra "document document document").
The problem is that within a year (by 2007) I had too many bookmarks. Not in terms of information overload and managing my attention, but in terms of computer memory management. Firefox crashes and burns with 10K bookmarks, as do keditbookmarks and many other programs. Opera doesn't (well -- it does if you use the "Add bookmark" menu item, so you have to drag and drop instead). But here's what Opera is doing: it loads this giant, poorly formatted ADR file into memory. The entire thing. In my case, that's 2.4 MB of raw bookmarks. The advantage is that I can quickly search for any term I like, and if it's in the description of the link, the URL, or the link name, it finds it. That's really nice. I can scroll effortlessly and click to other categories, but the moment I try to edit a bookmark or make a new top-level folder, Opera nearly freezes/dies. Luckily, it doesn't, but this is getting ridiculous.
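Loading the whole file isn't actually necessary just to read it -- the entries could be streamed one at a time. A minimal sketch, assuming the ADR format is roughly what I remember (entries opened by "#URL" or "#FOLDER" lines, followed by indented KEY=VALUE lines; this is from memory, not a spec):

```python
# Sketch: stream an Opera ADR bookmark file instead of loading it whole.
# The format assumed here ("#URL"/"#FOLDER" headers followed by indented
# KEY=VALUE lines) is a guess from memory, not an official spec.

def parse_adr(lines):
    """Yield one dict per entry, without holding the whole file in RAM."""
    entry = None
    for raw in lines:
        line = raw.strip()
        if line in ("#URL", "#FOLDER"):
            if entry:
                yield entry
            entry = {"type": line[1:]}       # "URL" or "FOLDER"
        elif entry is not None and "=" in line:
            key, _, value = line.partition("=")
            entry[key] = value
        elif not line and entry:
            yield entry                      # blank line ends the entry
            entry = None
    if entry:
        yield entry
```

Fed a real file object (`parse_adr(open("bookmarks.adr"))`), this keeps only one entry in memory at a time, so the 2.4 MB file stops being the bottleneck.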
The amount of time to open up a new tab? Just CTRL+T. Two keys. Then you type in your URL or tab over to the Google search panel and off you go. The amount of time to click on a result? Over the years I have trained myself to spot the best results on a page, so I can pretty much target something as soon as it pops up on screen (think of it like a game) -- unfortunately this involves a lot of bias on my part, but I have limited time (we all do -- until we fix some other problems).
The amount of time to bookmark a page? Clicking the bookmark-this menu item makes Opera whirl and whine in the background because it's building a list of all of the folders, and it will be at least 10 to 120 seconds before it comes back, if it doesn't die. Okay. Drag and drop? Well, that takes time because you have to go into the bookmark manager (as a tab), find the category (which I am not complaining about), then scroll through your list of open tabs, find the tab, and literally drag it over to the place where you want to drop it. That takes significantly more time than CTRL+T -- so it takes more effort to categorize a page than to open one. That might be okay, if it weren't for the 2-3 seconds it takes to open an "edit bookmark" dialog.
So for at least a year I have been going into #opera, the Opera.com forums, and other places on the internet, asking around for solutions to my problem and getting the same suggestions each time. I have some notes, somewhere, on alternative bookmarking systems, but it seems like there is always going to be the same fundamental problem. Suppose I make a bookmark manager with an SQL backend. There would be a GUI pane for the hierarchical list of categories, and another pane for the bookmarks in the currently selected category. Suppose there are 2k bookmarks in the current category. If you load up 2k bookmarks (minimally: title and ID), you can display a limited number of items due to RAM constraints. That's okay, I guess. But if that "2k" suddenly becomes more than RAM can handle, what happens then? You start coding for scalability, so that you only request enough SQL data to fill the visible part of the screen and a bit more. But then try scrolling through your bookmarks. Each time you scroll through 'em really fast, you have to make a query to the database, and you get the same lag again, this time distributed to the view/find action instead of the add/edit action. (In fact, the add/edit action would still take some time, but it doesn't have to be upfront. Instead, you can just pop up a GUI that says "Add bookmark?", edit the data there, then send it on a queue to be processed by the SQL database in the background. But if you were to hover over a bookmark and want data on it, it would take a few moments for the SQL data to be retrieved; same with editing. Is this lag necessary?)
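The "only request enough to fill the screen" idea is easy to sketch with SQLite; the schema and names below are invented for illustration, not any real bookmark manager's:

```python
import sqlite3

# Sketch of the viewport-sized query: the view asks for one screen's worth
# of rows, so scrolling issues many small queries instead of one huge load.
# Table and column names are invented for this example.

def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS bookmarks (
        id INTEGER PRIMARY KEY,
        category TEXT,
        title TEXT,
        url TEXT)""")
    # Index so category + scroll-order lookups don't scan the whole table.
    db.execute("CREATE INDEX IF NOT EXISTS by_cat ON bookmarks(category, title)")
    return db

def visible_rows(db, category, first_row, page_size=25):
    """Fetch only the rows currently scrolled into view."""
    return db.execute(
        "SELECT id, title FROM bookmarks WHERE category = ? "
        "ORDER BY title LIMIT ? OFFSET ?",
        (category, page_size, first_row)).fetchall()
```

With the index in place each scroll step is a small indexed query, which is exactly where the lag moves from add/edit to view/find -- the query cost doesn't vanish, it just relocates.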
This problem exhibits itself in other ways as well. The programmer's documentation problem: the more you code, the less you have documented. The blogging problem: the more you blog, the less time you have to go live your life and come back to blog about it (yes, I know there's a "balance answer," but perhaps there's a way to show that blogging is orthogonal to the actions you are otherwise taking, thereby showing that different actions can work on the same problem space even if they don't appear to). The caching problem: the more you cache, the less mobile you become (you have to decide what to uncache). Hopefully, this is a problem that can be solved via a perceptual shift of my view of the problem space.
On further consideration, bookmarking is a lot like Nutch and search engines, indexing, crawling, etc. In fact, that's exactly what bookmarking is -- cached crawling and spidering. I was thinking about my previous solution, conceived in 2006 or 2007, where a flat file system would host one link per file, or files with multiple links inside. And as I was thinking about that a few moments ago, I realized that searching through the flat files would be a problem (indeed, this is probably why I didn't implement the system). This searching problem is solved by computer clusters like Google's: multiple machines crunch away on the data set, distributing the workload across boxes so that they all contribute to searching the index for the relevant information. Perhaps my bookmarking problem could be solved if I had a local cluster? A nonlocal cluster would be problematic because of the lag: the time it would take to locally search through 10,000 bookmarks would be O(n), and on a cluster of p nodes it would be roughly O(n/p), but the time it would take for the data to be sent back to me might kill all the benefits of clustering -- or would it? It would take at least 50 to 100 milliseconds to complete the HTTP transaction (or any other transaction protocol; it doesn't matter). The cluster generates many results from many individual nodes, the results are usually organized by a master node serving the current transaction, then all of the data is linearly forced down the line to the single node that initiated the query, and the results are linearly displayed. All of that 'linear' stuff probably makes it that much worse. There are conceptual links here to graph theory, computer science, big-O notation, graphviz, mathematics, evolutionary engineering, and all sorts of other topics, but they are hard to find, and the principles are hard to extract.
For example, perhaps a relevant principle might be one that states that linear constraints should be avoided as often as possible.
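One way to dodge the linear scan without any cluster at all is an inverted index over the bookmark text -- pay the indexing cost once, then each query only touches the posting lists for its terms. A toy sketch (the tokenizer and data layout are mine, not any browser's):

```python
from collections import defaultdict

# Toy inverted index: build once over all bookmarks, then a search touches
# only the posting list for each query term instead of scanning every entry.

def build_index(bookmarks):
    """bookmarks: {id: searchable text}. Returns {token: set of ids}."""
    index = defaultdict(set)
    for bid, text in bookmarks.items():
        for token in text.lower().split():
            index[token].add(bid)
    return index

def search(index, query):
    """Return IDs containing every query term (empty set for unseen terms)."""
    posting_lists = [index.get(t.lower(), set()) for t in query.split()]
    return set.intersection(*posting_lists) if posting_lists else set()
```

A naive split-on-whitespace tokenizer is obviously too crude for URLs and descriptions, but even this version turns the O(n) scan into a few set intersections.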
"You are what you cache." Or you are what you hoard ("hoard: save up as for future use"). You are what you do. "What you do to get somewhere becomes who you are once you arrive." "You are what you cache, so pay attention to where and how you browse. " "You are Now."
Bookmarking is just a subset of the broader problem of caching parallel functionality, the ability to have massive amounts of work done and then to get the information all back to one place in some organized fashion. The amount of time that it takes to scan a data set can be reduced, but then there's information overhead for the management of the cluster. I suspect there are some equations that can help elucidate when it is practical to use a cluster for a certain problem space, and when it would take less time to have the lag on the user's immediate interface. I suppose one trick might be to let the user decide which mode he wants: does he want to cache as much data as possible at once, therefore limiting the amount that he can edit, or does he want to cache the functionality that would allow him to edit and add anything he likes, but at the cost of communication lag? When is it practical to cluster a problem?
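A back-of-the-envelope version of that break-even point is easy to write down: the cluster only wins when the scan time it saves exceeds its fixed round-trip and coordination overhead. All the rates and latencies below are invented placeholders, not measurements:

```python
# Rough break-even model for "should I cluster this search?".
# local:   T = n / scan_rate
# cluster: T = round_trip + n / (scan_rate * nodes) + merge_per_item * n
# All parameter values are invented placeholders for illustration.

def local_time(n, scan_rate):
    return n / scan_rate

def cluster_time(n, scan_rate, nodes, round_trip, merge_per_item=0.0):
    return round_trip + n / (scan_rate * nodes) + merge_per_item * n

def cluster_wins(n, scan_rate, nodes, round_trip):
    return cluster_time(n, scan_rate, nodes, round_trip) < local_time(n, scan_rate)
```

With, say, 100,000 items/second per box, 8 nodes, and a 100 ms round trip, 10,000 bookmarks scan locally in 0.1 s -- the cluster's round trip alone already costs that much, so clustering only starts paying off somewhere past that size. That is the equation's answer to "when is it practical to cluster": when n is large enough that the saved scan time dwarfs the fixed lag.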
Cells capture/cache functionality as well. They cache the resources and nutrients coming in from the bloodstream. They cache the molecular messengers from neighboring cells and the organs of the body. Cells even cache DNA, genes, and the cascading functional networks for biochemical/metabolic processes. While there is a natural progression toward senescence -- cells and other life forms tend to give more than they cache as they grow older (and cache more than they give when they are younger) -- the phases can be controlled.
It's almost as though this object in hyperspace, glittering in hyperspace, throws off reflections of itself, which actually ricochet into the past, illuminating this mystic, inspiring that saint or visionary, and that out of these fragmentary glimpses of eternity we can build a kind of map, of not only the past of the universe, and the evolutionary egression into novelty, but a kind of map of the future.
The other day I was considering a method of overcoming the problems with parallel computation, evolutionary engineering, etc. The idea is to offload some of the processing to other individuals in society. "Response is key." I tend to cycle through my projects, ideas, bookmarks, and other items of interest over the years to see if I can get a response out of my cached community, as it seems there has to be some reciprocity in order to make progress -- though not always; I am still hoping there is a way to "brute-force it". If you can prime the contexts to a certain extent, I think it is possible to encode some information to be processed by nature, and then get the results back. This may be more trouble than it's worth: you would have to know how to tap into various people, how to make an optimal meme and have it spread, and how to get the output of the 'society' computation back. Tough. It may be more practical to focus on actually building systems to solve specific problems.
I wrote the above document sometime in 2006 or maybe 2007; I don't remember anymore. More recently I have come up with a few solutions to the dilemma, involving a synthesis of computational neuroscience / biofeedback, squid-proxy, and openmosix. The idea is to use clusters to process biofeedback information extracted from the brain, along with the grammar-unfolding scenarios proposed by Ted Nelson and others. As you browse, squid-proxy automatically archives everything unless you specifically delete it (via a specific key press). Information is tagged with the "unused" brain output, which can be acquired via EROS, fMRI, MEG, brain implants (MEAs), or other forms of neurobiofeedback. As you continue to process the information coming at you, the kernel can slowly (effortlessly) use the openmosix system to slip/slide different programs across the cluster and have them "sift" further down and out of your way, while remaining ultimately accessible in a methodical, archival manner (a known locating scheme). The neuroscience/neuroengineering behind reducing the attentional effort required to do 'hard' things simply means reducing overhead: offload the "unused" information that doesn't make it to your fingertips, then correlate that information to do the tagging and "real-time programming" so that the functionality of websites can be more thoroughly integrated -- so that they don't just become a 'stale' representation of yet another item on the todo list, but rather something that is actively, immediately digested, with minimal overhead in click-click-click form, but maximized retention / processing / focus on the relevant tidbits. There is no reason why one has to limit himself to one brain per lifetime, and this begins to get into the ideas of neurofarms, building brains, and exponential growth. On those last two topics, see recursion.html and exp.html.
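The squid-proxy half of this is the most concrete part. A sketch of the relevant squid.conf directives for hoard-everything browsing (the sizes and paths are placeholders, and this is deliberately tuned for archiving, not a sane production config):

```
# Sketch of an "archive everything" squid.conf; sizes/paths are placeholders.
cache_dir ufs /var/spool/squid 20000 16 256   # ~20 GB on-disk cache
maximum_object_size 100 MB                    # keep large pages and files too
cache_mem 256 MB                              # hot objects held in RAM
# Keep cached objects for up to a year before revalidating.
refresh_pattern . 0 100% 525600
# Serve from cache even when objects would normally need revalidation.
offline_mode on
```

The "delete via a specific key press" part would still need glue outside squid -- something that maps the key press to purging the current URL from the cache -- but the automatic-archive side falls out of ordinary proxy configuration.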