On backups/redundancy

Recent events gave me cause to consider my personal data backup and redundancy strategy for my Debian installs. Or, more accurately, they caused me to amend my half-baked, semi-implemented existing approach so that I won't lose data or have to reconfigure things from memory or from scratch in the event of a hard disk failure.

My present “backup” approach is really somewhere between a time-limited backup and redundant storage. Essentially, I use Unison to synchronize my home folder (with certain sub-folders ignored, e.g. certain git repositories, config and thumbnail cache folders) between my desktop and my netbook. I have to run Unison manually, so I end up synchronizing my data every week, give or take. This effectively kills two birds with one stone: I get to have local copies of my important data as up-to-date as my last sync for when I'm on the road and using my netbook, and …


Copying an existing Linux system to a new hard drive

I recently upgraded my home desktop's hard drive, because the old one was getting a bit full. When I googled for instructions on how to transfer an existing system onto a new drive, many posts suggested using the cpio command, and that's what I tried. While this command does the job for the most part, I ran into one caveat that makes cpio less than ideal for the task.

Don't use cpio to clone filesystems. Why? Because GNU cpio doesn't support access control lists (ACLs) or extended attributes (xattrs).

Because of this, a system copied with cpio will end up a little screwy in some edge cases. The particular case I ran into involved folders in /media that are managed by udisks2. udisks2 creates personal mount folders under /media with tailored ACLs to help properly manage permissions for permission-capable filesystems mounted by regular users. If you already have one of these personal mount …


Adding the binary entropy function to LibreOffice Calc

Lately at work I've been doing some data analysis in LibreOffice Calc that requires the binary entropy function. The function itself looks like $H_2(x) = -x \log_2(x) - (1 - x) \log_2(1 - x)$, where $0 \log_2(0)$ is taken to be 0. It's this latter point that makes things a little tricky. LibreOffice Calc doesn't have this function built-in, sadly, and you have to explicitly guard for the case where x is 0 or 1, which is not easy to pull off inside a cell.

So I wrote a basic macro that implements it:

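Function BINENT(x)
    ' Binary entropy in bits; 0*log2(0) is taken to be 0.
    If x = 0 Or x = 1 Then
        BINENT = 0
    ElseIf x > 0 And x < 1 Then
        ' Log is the natural logarithm, so divide by Log(2) to get base 2.
        BINENT = -(x*Log(x) + (1 - x)*Log(1 - x))/Log(2)
    Else
        ' Outside [0, 1] the function is undefined.
        BINENT = Null
    End If
End Function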
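
If you want to sanity-check the values the macro produces, here is a minimal Python sketch of the same formula. It is not part of the Calc setup; the name binary_entropy is just one I picked for illustration, and raising ValueError for inputs outside [0, 1] is my own choice for this sketch.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy H2(x) in bits, with 0*log2(0) taken to be 0."""
    if not 0.0 <= x <= 1.0:
        # Assumption for this sketch: reject inputs outside [0, 1].
        raise ValueError("x must lie in [0, 1]")
    if x == 0.0 or x == 1.0:
        # Guard the endpoints explicitly, since log2(0) is undefined.
        return 0.0
    return -(x * math.log2(x) + (1.0 - x) * math.log2(1.0 - x))


if __name__ == "__main__":
    # Print a few values to compare against the spreadsheet.
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"H2({x}) = {binary_entropy(x):.6f}")
```

The function is symmetric about x = 0.5, where it peaks at exactly 1 bit, so comparing a value such as 0.25 against 0.75 is a quick way to spot a mistyped formula.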

To be able to use BINENT in Calc, go to the “Tools” menu, “Macros”, “Organize Macros …


How to subscribe any email address, including a GMail alias, to a Google group

A few of the interests I have and projects I follow use Google Groups to keep in touch. Essentially, these are just mailing lists that Google maintains. At any rate, while it's possible to subscribe to a group without having a Google account by sending an email to (group name)+subscribe@googlegroups.com, if you do have a Google account and use GMail with aliases, it can be tricky to ensure that you subscribe using your preferred address. This is because Google, in its infinite wisdom, will subscribe your GMail address if it detects you attempting to subscribe under any alias that's associated with your GMail account. That has the side-effect that you won't be able to post to the group with anything but your GMail address.

Luckily there's an alternative subscription mechanism (which I spotted here). If you instead use the Google Groups web interface, Google won't have the …


How to find, and obliterate, large files in the history of a subversion repository

Sometimes, as I have, you'll find yourself working with colleagues who, through no fault of their own, are either not acquainted with the etiquette of Subversion repository use, or simply have an accident. What you may then end up with is a repository that contains one or more giant blobs of useless data that should never have been added in the first place. Even if the culprit, with the best of intentions, removes these giant blobs in a subsequent revision, they remain in the repository's history, and you're still left with a huge chunk of nothing-much wasting space on your server's hard drive.

Although it is a long-standing item on Subversion's wishlist, there is no command that will simply obliterate files from the repository's history. Nevertheless, there is a way to achieve this. Here's how.

The first step of the process is to determine which files need to go. (Some snippets in the following are derived from StackOverflow and Christosoft blog.) First …

