Author Archive

Linux How-Tos and Linux Tutorials : NewsBlur: The Open Source Feed Reader with Brains

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Google Reader is the undisputed champ among Web-based RSS and Atom feed-readers. But while the search giant gets plenty of karma points on the software freedom front, Google Reader's status as a commercial product means that from time to time, features have to come and go. The latest change is the removal of social-networking "share this" functionality, as Google Reader gets merged into Google Plus. The open source feed reader NewsBlur is ready to make a play for your attention, adding not just link-sharing but multi-user rating and intelligence.

At Your Service

NewsBlur has been in development since 2010, the brainchild of developer Samuel Clay. Although all of the source code is hosted at GitHub (and under the permissive MIT license), Clay is openly trying to use the service as a funding source, by hosting a NewsBlur service on the newsblur.com domain. You can sign up for a NewsBlur account for free, if you can live with the limitation of 64 active feeds. Alternatively, you can remove the feed limit by opting for a pay-what-you-choose plan.

Naturally, the hidden secret in this pricing formula is that the more users who use newsblur.com accounts (free or paid), the better the service's statistics and recommendation engine become. When you sign up, you can automatically import your existing feed collection from other Web services (Google Reader included) using OAuth, or upload an OPML file from another application. Luckily for us feed-hoarders, the importer pops up a dialog box to let you select your 64 favorite feeds if you are using a free account, and attempts to pre-select the 64 most popular.

Once loaded and fired-up, the NewsBlur interface is easy to figure out if you have used any comparable feed service: you get a left-hand-side vertical column showing your feed titles and the unread message count for each, and a larger window on the right with the content for the selected feed. You can place the list of unread titles above or below the pane where you actually read posts and stories.

What is more interesting is that NewsBlur also gives you a choice of three views for each entry. You can view the feed content in feed-reader form (which is just text, images, and embedded objects rendered into the window), or you can switch to the "Story" tab, which instead loads the original page associated with the entry. That can help a lot with wonky feeds that use poorly formatted image or video embeds, and with reading the comments attached to an item. Finally, you can switch to the "Original" tab, which loads the base URL of the site. Just so we're clear, these latter two options load the original content in the feed-reader window; they are simply there to help you make sense of content and find additional links, feedback, or features of the site.

Intelligence Features

The "intelligence" features of NewsBlur come from analyzing the ratings and popularity of individual feed sources on individual accounts. You can view site-wide statistics for any feed by clicking on the line-graph icon in the title field; the report will show you traffic trends, subscriber numbers, and a sorted list of user-assigned keyword tags. You "train" the NewsBlur recommendation engine by clicking on the colored-block icon next to the graph icon. Up will pop a window with two panels inside: one listing all of the authors in the feed (gleaned from the RSS or Atom tags themselves), and one listing all of the category and tag names. For each one, you can click on either a thumbs-up or a thumbs-down icon to record your feelings. You can do the same thing for each individual story by clicking on the arrow icon next to the story title.

NewsBlur aggregates your ratings and uses them to predict which stories you'll like in the future. A red-yellow-green slider at the bottom of the feed list allows you to control what you see: green shows you those stories tailored to your tastes, yellow just the basics, and red absolutely everything, including the stuff you'll hate. You can get feed recommendations generated by other users by clicking on the Dashboard link at the top of the feed list.

Of course, NewsBlur also sports social news-sharing features. By clicking on the arrow icon by each headline, you can star it, email it, or share it with Twitter, Facebook, or another service. Currently the list of supported services is quite small; it would be nice to be able to add your own to the mix, so that Identi.ca users can play along. But the code is still young, and after all it is open source.

On Your Server

Since NewsBlur reached a stable point, Clay appears to have been spending most of his time developing mobile solutions for iOS and Android device owners. Although the code is available for download, the installation instructions are on the sparse side.

NewsBlur is written primarily in Python, using the Django framework. It uses jQuery on the client side, and a number of server-side components (such as RabbitMQ and Celery) to handle feed fetching and content parsing. The database setup is non-trivial; NewsBlur can use either MySQL or PostgreSQL to manage feed lists, the list of user accounts, and account information, but it also uses a MongoDB database to store the contents of the actual pages, usage stats, and old stories no longer in the latest feed update. The app needs this information in order to track statistics and analyze story and feed popularity; without it the "intelligence" would not be so intelligent.

Another wrinkle is that NewsBlur is hard-coded to use Amazon S3 as a storage back-end. While this makes sense from newsblur.com's public service angle, it does introduce a difficulty for those interested in running their own, private NewsBlur server. I ultimately decided that, in spite of how much I like the NewsBlur interface and ratings system, I was not interested in paying for an S3 account in addition to the Web hosting plan I already use. I'm hopeful that future releases will allow additional storage options, at the very least WebDAV, which should be sufficient for single-user setups. Nevertheless, I walked through as much of the installation process as I could in order to get a feel for the system.

NewsBlur makes heavy use of the Fabric Python library for lower-level installation tasks and configuration. Fabric will download and install other system prerequisites for you, including MongoDB, but you will still need to manually set up a MySQL or PostgreSQL user and database for NewsBlur to use.

Most of the configuration is done in NewsBlur's fabfile.py and local_settings.py files. The fabfile.py file sets up the server environment, including which machines host the Web app, the databases, and the maintenance tasks; these can be separate machines, but probably do not need to be for a personal site. This file also includes several hard-coded references to the newsblur.com site that you will need to modify to point to your own servers, and you may need to tweak some file paths as well. The local_settings.py file contains database credentials, OAuth and S3 keys, and logging and administrative email setup.

After customizing the config files, you use Fabric scripts to bootstrap the Web app and task server, and you can use your own discretion as to how often to run the administrative scripts that update feed content, collect user ratings, and generate statistics. At that point, it is up to your users to subscribe to feeds and rate content in order to give you useful data to work with.

Ultimately, NewsBlur is probably overkill for a single-user setup; for that a smaller-scoped application like Tiny Tiny RSS is likely to remain the better choice. However, if you are interested in running a multi-user feed service, NewsBlur is a good place to start. The rating and statistics engines really do give it a level of artificial intelligence that other open source feed readers do not even explore. There are a lot of ways to share interesting content online, and the in-app page renderer is nice, but the ability to learn from people's real-life reading habits is a unique edge.

Linux How-Tos and Linux Tutorials : Weekend Project: Control Your Configuration with Etckeeper

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

When we discussed ensuring a hassle-free upgrade recently, I casually mentioned one of the benefits of using version control to keep track of changes to the /etc/ configuration directory — the fact that it makes it easy to migrate your settings to a new machine. But there are other benefits, too, such as rolling back after accidents, and tracking down unintended changes made by overactive packages. Let's take a look at how etckeeper can help.

Background

Debian developer Joey Hess started writing etckeeper after unsatisfying experiments with other people's attempts to shoehorn /etc/ into a Git repository. A few people had done so successfully, but ran into two major problems: what to do when a package installation made changes to a directory or file (so that the user had no chance to enter the usual log entry), and what to do about metadata changes like file permissions. File permissions are pretty important for files like /etc/shadow, but most version control systems (VCSes) are not set up to notice them, because a VCS is designed primarily for software development.

Hess set out to combine the home-brewed solutions demonstrated by others into an easily-installable package. The result is etckeeper, a highly flexible revision control service for /etc/. It can use not only Git (the original option), but any of a long list of popular VCSes, including Mercurial, Darcs, and Bazaar. It tracks metadata changes on its own, and although you can manually enter changelogs whenever you edit a file, it also does automatic nightly checks to record any updates you've overlooked.

Etckeeper also hooks into all of the major package managers: Apt, YUM, and Pacman-G2. As a result, any changes that are triggered by pre-installation scripts, new packages, or post-installation scripts are automatically tagged in the log.

Installation & Configuration

Etckeeper is already packaged for most of the major distributions, including Debian, Ubuntu, Fedora, and OpenSUSE. If you use another distribution, or simply prefer to compile from source, you can grab the latest release from the etckeeper site. The most recent version is 0.57, from November of 2011. There are no major dependencies for the core package itself; however, you do need to have at least one of the supported VCSes installed, because etckeeper makes use of your VCS of choice rather than implementing its own.

The main configuration file is located in /etc/etckeeper/etckeeper.conf, and the first option you need to check is the VCS line. Simply uncomment the line for the VCS that you intend to use; most distributions will already have chosen a sensible default that works well with their other development tools (for example, Ubuntu defaults to Bazaar). Below this line is a set of COMMIT_OPTIONS lines, one per VCS, on which you can specify command-line options that etckeeper needs to pass to the VCS upon each logged change.

Next are two options for fine-tuning etckeeper's logging behavior. You can uncomment the AVOID_DAILY_AUTOCOMMITS=1 line to tell etckeeper not to run a nightly check looking for unlogged changes. You can also uncomment the AVOID_COMMIT_BEFORE_INSTALL=1 line to keep etckeeper from running an extra autocommit before starting a package installation. This is safe to do, because if the autocommit is switched off, etckeeper will stop the installation process, warn you that there is an un-logged change in /etc that you need to worry about, and allow you to perform the commit (or roll it back) before proceeding.

The last two lines ask you to specify the high-level and low-level package managers used on your system — for instance, Apt and dpkg on a Debian-based system, or YUM and rpm on an RPM-based system. Here, too, the distro will probably have the default chosen correctly for you. You may need to initialize etckeeper after installation — but only if your distro's package does not do so automatically. Check the documentation to be sure, but if you need to initialize the system, just run sudo etckeeper init.
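
As a rough illustration, the settings discussed above might look like the following on an Ubuntu-style system. This is a sketch, not a copy of any distribution's shipped file: the values are assumptions, and the names BZR_COMMIT_OPTIONS and AVOID_DAILY_AUTOCOMMITS follow the etckeeper.conf files I am familiar with. Your installed copy is the authority on exact spellings.

```shell
# /etc/etckeeper/etckeeper.conf (illustrative values only)

# Exactly one VCS line should be left uncommented.
#VCS="git"
#VCS="hg"
VCS="bzr"
#VCS="darcs"

# Extra command-line options passed to the VCS on each commit.
BZR_COMMIT_OPTIONS=""

# Uncomment to skip the nightly sweep for unlogged changes.
#AVOID_DAILY_AUTOCOMMITS=1

# Uncomment to make etckeeper stop a package installation, rather than
# autocommit, when /etc has uncommitted changes.
#AVOID_COMMIT_BEFORE_INSTALL=1

# High-level and low-level package managers (Debian/Ubuntu shown).
HIGHLEVEL_PACKAGE_MANAGER=apt
LOWLEVEL_PACKAGE_MANAGER=dpkg
```

The file uses plain shell variable syntax, so a stray space around an equals sign is enough to break it.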

Ready, /etc, Run

In practice, etckeeper will run in the background, and your main interaction with it will be through changelog messages you submit after each editing session where you modify a configuration file. Etckeeper needs root privileges to run, but you will also need them to edit the files in /etc/.

For example, if you change the contents of /etc/hosts, you can then log the reason for your change with etckeeper. Type sudo etckeeper commit "Added new NAS box to hosts file" and etckeeper will commit the changed file to the VCS, tagged with your commit message.

You can check the history of your changes using the raw VCS tools. For example, on a system where etckeeper is set to use Bazaar (the Ubuntu default), type sudo bzr log --line /etc. Bazaar will play back the date and log messages of each of your changes. You're more likely to remember the details of the change if you made it yourself, but don't forget that etckeeper logs changes triggered by package installations as well. To see line-for-line what has changed since the last commit, all you have to do is type sudo bzr diff /etc.
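
Since the log command depends on which VCS etckeeper is configured to use, a small helper can map one to the other. The function name etc_log_command is my own invention, and the commands it prints are simply the usual short-log invocations for each tool, not anything etckeeper itself defines.

```shell
# Print the usual short-log command for a given VCS name.
# Hypothetical helper; the mapping is an assumption, not part of etckeeper.
etc_log_command() {
    case "$1" in
        git)   echo "git log --oneline" ;;
        hg)    echo "hg log" ;;
        bzr)   echo "bzr log --line" ;;
        darcs) echo "darcs changes" ;;
        *)     return 1 ;;   # unknown VCS name
    esac
}

# Example (needs root and a configured etckeeper):
#   . /etc/etckeeper/etckeeper.conf
#   cd /etc && sudo $(etc_log_command "$VCS")
```

Sourcing etckeeper.conf works here only because the file is written in shell variable syntax.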

Etckeeper writes detailed log messages to the VCS when tracking package-triggered changes. You will see "committing changes in /etc after apt run" or a similar message, followed by a list of the changed packages, with a + indicating an installation, and a - indicating a removal.

Commits made during etckeeper's daily sweep are logged with "daily autocommit" as the message. Naturally, this is the least informative of the changelog messages, but etckeeper has no information to go on in these circumstances. In this case, it's important to remember that the VCS commit logs the actual file alteration for rollback purposes as well. An unidentified change caught by the nightly sweep might not have a clear message, but at least you know it is logged, and you can revert it in the case of problems.

Metadata changes are logged in their own special file, /etc/.etckeeper. By writing the permissions and other metadata values to a flat file, etckeeper gets around the fact that most VCSes do not track file metadata at all. Permission changes are thus recorded in etckeeper's logs as changes to the .etckeeper file. You can read the contents of the file on your own, or view its log with the normal VCS diff commands.
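
The flat-file idea is easy to try on a scratch directory. The sketch below is not etckeeper's own code or file format; the record_modes function and the .modes filename are made up for illustration, and stat -c assumes GNU coreutils. It only shows why a permission change becomes visible to a VCS once modes are written out as text.

```shell
# Scratch directory standing in for /etc.
demo_dir=$(mktemp -d)
touch "$demo_dir/hosts" "$demo_dir/shadow"
chmod 644 "$demo_dir/hosts"
chmod 600 "$demo_dir/shadow"

# Write "mode path" pairs for every regular file into a .modes file,
# sorted by path so that successive snapshots diff cleanly.
record_modes() {
    (cd "$1" && find . -type f ! -name .modes -exec stat -c '%a %n' {} + | sort -k2) > "$1/.modes"
}

record_modes "$demo_dir"
before=$(cat "$demo_dir/.modes")

# A permissions-only change: the file's content is untouched...
chmod 640 "$demo_dir/shadow"
record_modes "$demo_dir"
after=$(cat "$demo_dir/.modes")

# ...but the flat file now differs, which is what the VCS will record.
printf 'before:\n%s\nafter:\n%s\n' "$before" "$after"
```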

Keep it real

Of course, the principal limitation of etckeeper is that it only monitors the /etc/ directory, while there are other important locations you might want to keep track of, too. But etckeeper is not just a mechanism to use version control on a single directory; it hooks into package management and performs nightly scans for a reason. The /etc/ directory is where changes are supposed to go regularly; if you want to set up your VCS to watch other specific locations (such as $HOME or /var/www/), you are likely to need a lot of customization anyway.

It may still be a worthwhile exercise; if you are so interested, I recommend checking out the links that Hess provides in his initial etckeeper announcement. They are Git-centric, but apply equally well to other VCS choices; you can learn a lot by following the path that others have been down before.

Linux How-Tos and Linux Tutorials : Nonlinear Writing on Linux with Storybook

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Works of fiction come in all shapes and sizes, and I don't mean simply the format (book, short story, graphic novel). Some are action- or plot-driven, others are primarily internal, some are epic in scope, others stream-of-consciousness and barely involve a chronology at all. So when I heard about the release of a new version of Storybook, ostensibly a tool to help writers write fiction, I was curious as to precisely what sort of help it would offer. The answer is "keeping track of plot points when you have too many to juggle" — whether that is the kind of help you need depends largely on what you write, but even if your novel is ready for the page, this application puts some hurdles in your way.

Broad Strokes

At its core, Storybook is a tool that allows you to disassemble a story into its constituent parts: characters, locations, individual scenes, plot threads, etc. That way you can build the larger tale by working on whichever individual element you need to at the moment, without losing track of the overall structure or forgetting to follow up on a pivotal detail. You don't have to start at chapter one; in fact, you can rearrange the pieces as often as you like.

Storybook is a Java application, with pre-built packages available for Linux and Windows. The latest release is version 3.1.0, which weighs in at just over 33 MB. The installer unpacks to an /opt/ folder in the user's home directory, which means you do not need root privileges to install it, and adds a Storybook.desktop launcher to the desktop. You can download two demo book projects from the main site, both of which are in German, but it is just as easy to skim through the tutorials and start up a new project of your own.

A relatively new wrinkle to the Storybook situation is the increasing divide between the free version of the application and Storybook Pro, the paid version that offers additional features. Some of these additional features are tools, such as the Memoria mind-mapping utility. But recent versions have started removing key features from the free edition of Storybook, including the ability to work while offline, and (most importantly) the ability to export your work to a standard document format.

The Plot Thickens

Weaving a tale in Storybook can be as simple as clicking the New Scene button on the toolbar and typing away. In this mode, Storybook automatically creates and numbers chapters and sub-chapter scenes for you. You can even split your book into larger divisions called "parts." But the application's main emphasis is on wrangling multiple plotlines at once. To do that, you will need to add plot elements to the database.

Storybook can keep track of characters, locations, plotlines (which it calls strands), and items. On its own, simply entering the name, occupation, and background info of characters is probably only of ancillary value to a writer — unless you are crafting a Tom Clancy-like tome of minor characters. The real benefit of entering this information comes in your ability to take views on the work as a whole, examining scenes in date-chronological order, or viewing which characters are where.

The more convoluted your tale, the easier it is to accidentally introduce an anachronism by having a character mention something in dialog that they "shouldn't" know yet. The ability to sort scenes by strand (color-coded) and interweave them over the course of the book is surely something that a budding LOST writer would benefit from.

That said, you cannot easily track themes or ideas, just concrete objects. Likewise, the properties expected of you for each element you track are similarly tangible: scenes must have dates, and characters have birthdays and occupations, not motivations or desires. In other words, you can write a definite, plot-driven book like The Lord of the Rings using Storybook, but you certainly cannot write Ulysses.

However, even if your needs are a match to the feature set, there are two difficulties in making major use of Storybook's plotting tools. The first is the rapidly-shrinking feature set available in the free version of Storybook: this release removes the ability to view a character- or strand-centric timeline, character appearances by date or strand, and even basic reference lists of characters and locations. It doesn't help matters, either, that the application is nag-ware, with a permanent "Upgrade Now!" banner in the lower-right corner of the screen.

The other problem is that Storybook still restricts you to a needlessly formal set of parameters for every story element that you want to track. Characters must have first names and must be either male or female; locations can only be grouped into cities and countries, etc. These categories are useful if you are writing an action-driven, real-world novel, certainly. But take the gender issue, for example; you cannot have aliens, characters whose gender is unknown to the other characters, or HAL from 2001. In reality, the only purpose that I can see to the gender assignment field is that it selects which of the two "person" icons to stick next to the character name in the list. Or consider the locations; the country and city categories are used to sort the locations into a collapsible tree-view widget. That comes in handy if your story is evenly-distributed around the globe. But if your entire book takes place in Springfield, Missouri or a tiny town in rural Minnesota, you get no sorting whatsoever.

The irritating thing is that both of these arbitrary restrictions are unnecessary; why can we not just add the locations we want, and group them into folders at will? If I need to keep track of "floor one" of my hotel and "floor two," rather than separate cities, it genuinely will not matter to the database underneath. Bruce Byfield reported on these and several other story element restrictions back in 2008. It is good to see that several of them are gone now (for instance, you can now assign relative dates to scenes, rather than strict calendar dates only), but all in all, too many sharp corners still remain.

A Missed Opportunity

There are minor bugs in Storybook 3.1.0 that do not affect your ability to write, but stick out. For example, the documentation warns you not to use "reserved" characters in your project name — a list that includes the forward-slash character and the question mark. On the one hand, if this restriction prevents anyone from penning a sequel to Face/Off, it would probably be worth it, but we would also lose Who's Afraid of Virginia Woolf? Ultimately, trapping reserved characters is something that any professional application needs to do for you, not a problem that it needs to place on your lap.

Still, my biggest disappointment with Storybook remains the removal of key functionality from the free version of the application. I do not begrudge anyone the right to try to make money off of their own software, but this is not a sustainable way to do it. Removing the ability to create the charts and reports that help you manage character and plot lines takes away sixty percent of what makes Storybook worth installing. Removing the ability to print, however, makes this an application not worth keeping around.

If the authors of Storybook wanted to grow a business out of their code, a better plan would have been to mimic what the creators of the storyboarding tool Celtx have done. Celtx offers paid support plans and cloud hosting for projects, both of which may actually be of value to professional users. But more importantly, it fosters a user community, with active discussion forums and even contributed add-ons. That is the real way to build a community of users willing to support your product.

In contrast, Storybook does not offer users a place to go to ask questions or discuss the application with each other, and support tickets are not even available for purchase by users of the free app. At best, that makes it difficult to explain why users would invest up front. At worst, it leaves the project open to a fork based on earlier versions of the source code before the feature removal got out of control. Either way, it is hard to picture a happy ending to the tale.

Linux How-Tos and Linux Tutorials : Weekend Project: Ensure a Hassle-Free Linux Upgrade

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Linux's long-term stability means that users can go for years simply upgrading packages without ever doing a re-install from scratch. Believe it or not, that is not always a good thing. It is the recommended practice for servers, naturally, but a peculiar side-effect is that when you do eventually re-install (a desktop or a server), you have ages of old tweaks and customizations built up, and reproducing them can be a confusing hassle. I recently undertook a from-scratch-reinstall, so some of the lessons I learned could be valuable when you tackle your next migration.

To be precise, I started experiencing hard drive problems on one of my desktop machines (with the disk on which my root partition lived), and seeing the writing on the wall, I decided to replace the aging disk. A new release of that machine's distro had just dropped (Ubuntu), so a fresh install was not much worse than an upgrade plus migrating to a new disk. I also took the opportunity to install / to an SSD, which does not affect the process much, and to finally move that machine from a 32-bit to a 64-bit build, which does affect the process. It means I cannot simply copy binaries from one machine to another; instead I have to actually sort out what is installed on the old machine, and replicate it.

From the bird's eye view, migrating your settings and "presence" to a completely new machine means taking stock of everything locally-installed and customized at the system level (namely application packages), determining the stock package-set you need to replicate onto the new machine, isolating any specialty applications like databases, and preserving your configuration and settings (system-wide and personal). Obviously you need to migrate your actual files, too, but that is hardly an unsolved problem. For the moment we will just stick to the software and non-data portions of the OS.

Support your /local/ Hierarchy

The first thing to do is take inventory of all of the packages on your old machine that you cannot simply re-install through the new machine's package management service. For starters, this includes everything that you have compiled and installed locally. According to the filesystem hierarchy standard (FHS), locally-installed software belongs in /usr/local/bin/ and (for system administration programs) /usr/local/sbin/. You can take stock of the contents of those directories to make sure you don't forget something. However, if you built RPM or Debian packages from source rather than installing the programs with make install, the packages may be installed in the normal /usr/bin/ and /usr/sbin/ directories instead.
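
Taking stock is easier with a saved list than with a glance at ls. Here is a minimal sketch; the function name inventory_local and the output filename in the usage comment are my own choices, and -maxdepth assumes GNU or BSD find.

```shell
# List regular files and symlinks directly inside each given directory,
# silently skipping directories that do not exist on this machine.
inventory_local() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 1 \( -type f -o -type l \) | sort
    done
}

# Example:
#   inventory_local /usr/local/bin /usr/local/sbin > local-software.txt
```

Keeping the resulting file alongside your package list gives you a checklist to work through on the new machine.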

In that case, you will need to turn to the package manager for help. Apt front-ends like Synaptic can show you locally-installed packages for Debian-based systems (including Ubuntu and all of its derivatives). As near as I can tell, YUM does not yet implement a similar feature for RPM-based distributions, although it has been discussed.

You can glean similar information with a bit of elbow grease, though: on the YUM list, Tom Mitchell suggests running rpm -qia and looking for the "Build Host" field — a locally-compiled package should have your local machine's hostname. Mitchell suggested piping the output to less, but rpm -qia | grep -B 3 'yourhostname' | less will find only those packages that match your local machine's name. Hopefully you've picked an unusual one.
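
If counting context lines with grep -B feels fragile, the same idea can be expressed by pairing each "Name" field with the "Build Host" field that follows it. This is a sketch of my own (local_builds is a hypothetical name), and it assumes the rpm -qia output puts the package name in the third whitespace-separated field of the Name line and the hostname last on the Build Host line.

```shell
# Read `rpm -qia`-style output on stdin and print the names of packages
# whose Build Host field equals the hostname given as the first argument.
local_builds() {
    awk -v host="$1" '
        /^Name[[:space:]]*:/      { name = $3 }                     # remember current package
        /Build Host[[:space:]]*:/ { if ($NF == host) print name }   # exact hostname match
    '
}

# Example (on an RPM-based system):
#   rpm -qia | local_builds "$(hostname)"
```

Because the comparison is an exact match on the hostname field, a common word in a package description will not trigger a false positive the way a bare grep can.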

Finally, you should always be wary of accidentally overlooking proprietary applications — particularly when they come in installation shell scripts rather than standard packages. Independent software vendors have a tendency to drop binaries and configurations in peculiar places. Many go in /opt/, but there is no foolproof way to find them all. You can use either rpm -qf /some/particular/pathname or dpkg -S /some/file/name to discover what (if any) package a suspicious-looking file belongs to, but you are just as likely to be successful by reading through your "Applications" system menu and taxing your memory.

Manifest Destiny

Next up, you will want to generate a list of the installed packages on your old system, which you can use to replicate the installed-package-list on the new machine. This is a little easier on Debian-based systems, because dpkg has a built-in command for the purpose, but it is easy enough on RPM distros, too.

On a Debian system, run dpkg --get-selections > my-software-packages.txt. This will write a list of all installed packages on your system to the my-software-packages.txt file. All means all; not just the packages you chose, but all of the libraries, data packages, and other dependencies that they pull in as well. It will be a long list.

You can edit it by hand and remove things you do not care about, but be careful not to erase a line while leaving in a package that depends on it elsewhere in the file; that would be a conflict. It is also up to you what to do about the locally-installed Debian packages (if any) discussed in the preceding section. The simple solution might be to uninstall them before generating the file, but that makes life harder if the migration takes longer than you expect. Excising the lines in question from a text editor is probably just as easy; most people don't have more than a handful of local packages.

On the new machine, you can flag the full list of packages by running dpkg --set-selections < my-software-packages.txt — then run dselect to start the installation, and enjoy a good book while the process churns.

RPM distro users can generate a comprehensive package list with rpm -qa > my-software-packages.txt. You also will need to consider polishing up the list before feeding it into the new machine's package manager; all of the same caveats apply. Once you are satisfied, however, run yum -y install $(cat my-software-packages.txt) on the new machine. Enjoy an equally-good book.
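One thing to watch when polishing the RPM list: by default rpm -qa prints each package as name-version-release (often with an architecture suffix), and those exact versions may not exist in the new release's repositories. A small Python sketch of one way to reduce the list to bare package names follows; the function name and sample entries are illustrative only, and it relies on the fact that RPM version and release strings cannot themselves contain hyphens.

```python
def package_names(rpm_qa_text):
    """Reduce `rpm -qa` lines (name-version-release[.arch]) to bare names."""
    names = []
    for entry in rpm_qa_text.split():
        # Imported signing keys show up as pseudo-packages; they are
        # not installable, so leave them out.
        if entry.startswith("gpg-pubkey-"):
            continue
        # Strip the last two hyphen-separated parts: version and release.
        name = entry.rsplit("-", 2)[0]
        if name not in names:
            names.append(name)
    return names

# Hypothetical excerpt of saved "rpm -qa" output.
RPM_QA_SAMPLE = """\
bash-4.2.10-4.fc15.x86_64
xorg-x11-server-Xorg-1.10.4-1.fc15.x86_64
gpg-pubkey-069c8460-4d5067bf
"""

print("\n".join(package_names(RPM_QA_SAMPLE)))
```

The printed names can be saved back to my-software-packages.txt and fed to the yum command above in place of the versioned list.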

Files You Actually Want

I said in the introduction that I was not going to address migrating data files, since that is in some ways a simpler process — and because no two users have the same data, it is harder to generalize. However, there are some system-wide files you need to make special provisions for, such as content in /var/ and configuration files in /etc/.

Some of these files are likely to differ enough on the new machine that you will not want to simply copy the old version into the new /etc/ directory; examples include the contents of /etc/modprobe.d/, /etc/fstab, and /etc/mdadm.conf. On the other hand, you may want to preserve some locally-honed configurations, such as any custom cron entries in /etc/cron.d/, LAN hostname information in /etc/hosts, network configuration in /etc/network/, or any firewall configuration you might have saved (such as in /etc/iptables.rules).

The tricky part is that Linux allows for so much flexibility in the naming and location of these configuration files that it is hard to write general-purpose rules. For example, you can store your firewall rules in /etc/iptables.rules, but you can just as easily keep them in /etc/network/if-pre-up.d/iptablesload. If you are not sure of the history of your customizations, the safe play is to make a copy of your /etc/ directory in a different location on the new machine, and resolve the differences one at a time. In the future, you might consider a configuration-version-control system like etckeeper for the new machine, so that the next migration will be better documented.
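Resolving the differences one at a time is easier with a list of what actually differs. Here is a minimal Python sketch using the standard library's filecmp module; the function name and the example paths in the note below are assumptions, not part of any migration tool. Note that filecmp's directory comparison is "shallow": files with identical size and modification time are treated as equal without reading their contents.

```python
import filecmp

def compare_trees(old_dir, new_dir):
    """Recursively list files that differ or exist on only one side."""
    report = {"only_old": [], "only_new": [], "differ": []}

    def walk(comparison, prefix):
        report["only_old"] += [prefix + n for n in comparison.left_only]
        report["only_new"] += [prefix + n for n in comparison.right_only]
        report["differ"] += [prefix + n for n in comparison.diff_files]
        # Descend into subdirectories present on both sides.
        for name, sub in comparison.subdirs.items():
            walk(sub, prefix + name + "/")

    walk(filecmp.dircmp(old_dir, new_dir), "")
    for key in report:
        report[key].sort()
    return report
```

Calling compare_trees("/root/old-etc", "/etc") (with the first path being wherever you stashed the copy) returns three lists of relative paths. Entries under "only_old" are the likeliest candidates for customizations you made yourself.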

The /var/ directory is similarly inconsistent. Many game packages use /var/games/ to store settings and data that needs to be visible to multiple users — forget to migrate it, and the worst that can happen is you lose all of your high scores. On the other hand, /var/www/ is historically where web applications are installed, and you probably do not want to move to a new machine without them. You may want to re-install local web utilities on the new machine if you are doing an upgrade at the same time — that depends on whether new versions of the scripting engines and libraries used are coming with the upgrade. Even if you do not, however, you probably need to preserve the data in the web apps.

In any event, this is another topic where you will have to devise a strategy on a case-by-case basis. Straightforward web utilities like the CUPS web front-end or phpMyAdmin you should have no trouble reinstalling on the new machine. A web-based accounting package that you use every day, however, needs to be treated carefully to avoid data loss.

All Your Data Are Belong to Base

Local web content leads into another important subject: databases. Migrating databases from one machine to another requires consideration of the database system used, the storage format, and the architectures involved. As with /var/www/, there is an important distinction between applications where you want to start over on the new machine because the context is different (such as Webmin), and applications where you need to migrate your data (such as a GIS app).

Fortunately, the need arises often enough that a lot of RDBMS-backed applications will bundle their own database-migration tools, just as they do backup tools. MythTV, for example, includes backup and restoration scripts that can be used to migrate from one machine to another without a loss in service.

Other apps might not be so generous, in which case you should start by consulting your database's documentation. MySQL has extensive online docs dedicated to migrating databases between machines. There is a good chance that your old and new desktop machines will use the same binary storage format (most PC architectures do), in which case the procedure is straightforward: you can copy the underlying .frm, .MYI, and .MYD files from one machine to another, provided that you copy the mysql database itself, and replicate the MySQL users to the new machine.

Of course, you'll need to find the aforementioned MySQL binary files before you can copy them. If you need to locate yours, run grep datadir /etc/my.cnf. Even for very large databases, this direct copy method is faster than the old-fashioned migration process, in which you dump the database on the old machine, transfer the dump to the new machine, and import it into a new database. If you happen to use a weird, non-default storage engine, you will need to do a little extra digging, but for most users, the migration process is painless. Just be sure to run a few tests on the new machine before you erase the old one.
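The grep above prints every line that mentions datadir, comments included. If you want only the value the server actually reads, a small parser does the job. The following Python sketch makes some simplifying assumptions: it looks only at the [mysqld] section of the text you give it, it does not follow !include or !includedir directives, and the function name is invented for this example. On Debian-based systems the file usually lives at /etc/mysql/my.cnf rather than /etc/my.cnf.

```python
def find_datadir(my_cnf_text):
    """Return the datadir value from the [mysqld] section, or None."""
    section = None
    for raw_line in my_cnf_text.splitlines():
        line = raw_line.strip()
        # Skip blanks and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        key, sep, value = line.partition("=")
        if sep and section == "mysqld" and key.strip() == "datadir":
            return value.strip().strip("'\"")
    return None

# Hypothetical my.cnf contents.
MY_CNF_SAMPLE = """\
[client]
port=3306

[mysqld]
# where the table files live
datadir = /var/lib/mysql
socket=/var/lib/mysql/mysql.sock
"""

print(find_datadir(MY_CNF_SAMPLE))
```

If the function returns None, the server is most likely using its compiled-in default location, and the MySQL documentation for your version is the place to check.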

Sweeping Out The Corners

There are likely to be peculiarities on any system that won't get caught by the general-purpose migration hints, but if you have the luxury of migrating to your new machine (or simply your new root partition and filesystem) while your old one is still around, you can catch most of them within a few weeks of running the new system. That list might include locally-installed packages that did not drop any executables into /usr/local/bin/ or /usr/local/sbin/ (such as window manager or icon themes), Java applications that ended up in unusual directories, and programs that use peculiar locations to save their data and settings (such as /usr/share/ instead of /var/; the fact that they shouldn't doesn't guarantee that they won't).

Naturally, even after you manage to replicate the software environment and system configuration from your old machine onto your new one, you still have to move your personal files. I have been assuming thus far that this content is either stored in one location (within /home/), or else on other mount points that you know well enough to re-attach properly. But when your migration also happens at the same time as a system upgrade (as my example did), simply mounting the old /home/ on the new machine can introduce its own share of problems because of deprecated and upgraded settings in the desktop environment and applications.

Consequently, it may be better to rename old configuration folders like .gconfd and .kde4 and use them only as reference when tweaking the new environment. The good news is that older apps and lower-level utilities are more stable than the desktop environments and far less likely to introduce API changes. So dependable favorites like .bashrc, .vimrc, .emacs.d/, and .ssh/ are liable to sail through any upgrade process, no matter how prickly, and be waiting for you on the other side.

Linux How-Tos and Linux Tutorials : UPnP with People: Open Source Media Sharing with MediaTomb

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

If you buy a "smart" TV or DVR these days, the odds are that it will support UPnP for hassle-free media sharing. Thus devices like TiVos can locate each other and share recordings across the LAN, and a lot of Linux music and video players support connecting to UPnP shares as well. Give it a moment's thought, however, and you'll probably ask yourself "does it need a specialized device, or can I turn my existing Linux box into a full-blown UPnP server for my existing content?" Well, good news: not only can you, but it is pretty painless to do.


UPnP stands for "Universal Plug and Play," and encompasses a family of network protocols that PCs and embedded devices like set-top boxes can use to discover each other and share services. The low-level parts of the stack are familiar territory: DHCP for network configuration, a simple HTTP-like service discovery protocol, event notifications, and so on. The media-sharing functionality is technically an extension to UPnP called UPnP AV, and is a recommendation of the industry trade group Digital Living Network Alliance (DLNA), which you may have seen in advertisements.

UPnP AV defines "media servers" and "control points"/"renderers" that roughly correspond to servers and clients in a typical client-server relationship. A renderer could be a dumb device like a TV monitor, but most devices offer both control point and rendering functionality. To make your media accessible everywhere, you will need to run a UPnP AV media server package on the machine where your content is stored, and use your client devices (or applications) to connect to it as control points.

For Linux users, there are several packages to choose from. They vary chiefly on whether or not they support audio, video, and still images (or some subset of the three), how they are administered, and whether or not they can perform on-the-fly transcoding. Transcoding is an important feature because many UPnP control points support a fixed set of codecs. You may have your videos in WebM and your audio in Vorbis or FLAC, but some of your UPnP devices — say, a home theater receiver — only support H.264 and MP3.

Some media center applications come with at least partial UPnP support built-in. MythTV and XBMC can both act as media server and pipe out content to UPnP control points, although they must be running in order to do so. This is not a problem for MythTV because the UPnP server is part of the back-end process, but XBMC is a GUI application that you probably don't want to run day and night. On the other hand, XBMC is trivial to set up, while MythTV requires a significant time investment.

Enter the Tomb

My favorite option for the moment is MediaTomb, a GPL-licensed package that supports audio, video, and stills, has flexible on-the-fly transcoding support, and is easily administered through a web-based front end. The latest release is 0.21.1, which is from 2010, but there are not any major pieces of functionality missing, so you should be good to go until there is a significant UPnP specification update.

You can download the source code from the project site, but check to see if your distribution provides a package first. The developers supply packages for Fedora, Debian, Ubuntu, Gentoo, Mandriva, openSUSE, and several other distributions (even some BSD options!), which should cover just about everyone.

MediaTomb UI

 

To set it up, first ensure that UDP port 1900, the port reserved for UPnP's service discovery protocol, is not blocked by your machine's firewall, and that any ports you specify in the configuration file /etc/mediatomb/config.xml are free. MediaTomb will happily select a high-numbered available port (starting with 49152) to run on whenever the process is started, and UPnP's service-discovery mechanism makes the choice transparent to the control points, but you might prefer a static port for convenience.

MediaTomb will usually run without requiring modifications to its configuration file, but there are a few settings worth double-checking. For example, by default the application uses SQLite for storage; if you prefer, you can change this to MySQL by changing enabled="no" to enabled="yes" in the MySQL directive. Of course, you will then need to add the appropriate MySQL user and password credentials to the configuration file to match. You also need to set transcoding to "yes" if any of your clients will need it.

You will definitely want to set ui enabled="yes" to switch on the web administration front end, but you can safely turn it back off after you have everything configured and running to your satisfaction. You can also enable or disable accounts (to prevent unauthorized access), but be aware that the login connection is not encrypted, so the password data is only as secure as your network. Lastly, there are several specific UPnP devices that require special quirk-mode support because they behave differently from the vanilla protocol. These include the PlayStation 3 and several D-Link players; there are documented lines in config.xml for each, and all you need to do is uncomment them.
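If you would rather script these toggles than edit the file by hand, the idea can be sketched in a few lines of Python with the standard library's XML parser. Treat the following strictly as an illustration: the sample document is a heavily simplified stand-in for config.xml, the element layout is an assumption, and the real file contains many more elements (and may declare an XML namespace that the lookup paths would need to include).

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for /etc/mediatomb/config.xml.
SAMPLE_CONFIG = """\
<config>
  <server>
    <ui enabled="no"/>
    <storage>
      <sqlite3 enabled="yes"/>
      <mysql enabled="no"/>
    </storage>
  </server>
  <transcoding enabled="no"/>
</config>
"""

def set_enabled(root, path, enabled):
    """Set the enabled attribute on the element found at path."""
    element = root.find(path)
    if element is None:
        raise KeyError(path)
    element.set("enabled", "yes" if enabled else "no")

root = ET.fromstring(SAMPLE_CONFIG)
set_enabled(root, "server/ui", True)      # switch on the web UI
set_enabled(root, "transcoding", True)    # allow on-the-fly transcoding
print(ET.tostring(root, encoding="unicode"))
```

Raising an error when a path is missing is deliberate: a silent no-op would leave you believing the UI or transcoding had been switched on when nothing had changed. Keep a backup of the real file before letting any script rewrite it.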

On the Air

MediaTomb registers as a service that can be started, stopped, or restarted using the standard Linux init system. On Debian-like OSes, this means sudo /etc/init.d/mediatomb start|restart|stop. Once you have the general configuration details in place, you can either reboot or start the service, then proceed to the web UI.

Most importantly, you will need to tell MediaTomb where your content is. Be sure you organize your material in a convenient manner — big directories with 10,000 items in them are not easily browsed, particularly on hardware devices, and you may want to make sure you separate home and work-related material (or any other such semantic distinctions) before you make all of it accessible to the network.

You can access the web UI most easily from a browser on the same machine that is running MediaTomb; visit http://localhost:49152 (or substitute the appropriate port number if you changed the default setting in the config file). You can dig into the documentation and bind the UI to a specific IP address so that it will be accessible from other machines, but this does not change its functionality at all.

What you will see is a two-pane window with a tree-style browser widget on the left and an empty space to the right. At the top, you can click to alternate between "Database" and "Filesystem" views. To add files, simply choose Filesystem, then navigate into the filesystem hierarchy on the left. When you reach the folder that you wish to add, the right-hand pane will display the folder's contents.

You can add individual files by clicking the "+" next to their name, or add the entire directory by clicking the "+" at the top of the pane. The button next to it shows a plus encircled by "recycle arrows" — clicking this instead will tell MediaTomb to add the directory and to watch it for changes. You can set the re-scan parameters to fire on any inode changes, or to periodically rescan by elapsed time.

When you add a directory, MediaTomb will scan it and add all of the metadata it finds to its internal database. Directory adds are recursive, so this could take a long time for a large library. When you click back over to the Database view, however, you can see the fruits of its labor: your content is cataloged and organized, with separate views available for Artists, Albums, Genres, and Years for audio, and directory views for photos and video. The various views are what MediaTomb serves up to clients when they connect and ask for content. Consequently, you might want to edit the details associated with particular items to make sure they appear correctly. You can do this by clicking on the pencil-and-paper icon next to each item in the Database view.

Extra Credit

At this point, MediaTomb serves up your content to any UPnP control points or displays without intervention. If you set your directories to auto-scan, you can disable the UI and never worry about it until you replace a hard disk. In fact, it is probably a good idea to turn off the UI, because the filesystem browser, if left running, would let anyone with access to the server take a peek at your entire file structure (although it would not let them see file contents).

The next task on the path to UPnP mastery is transcoding. The config.xml file contains a <transcoding> stanza that defines several "profile mappings" — instructions to report certain problematic file types to connecting clients as if they were something else, and a conversion utility to transcode them to the output format on the fly. UPnP clients should be able to see unplayable content, simply rendering incompatible formats as grayed-out entries, but not everyone is up to snuff, so MediaTomb has to provide some workarounds. The UPnP specification does permit the server to report more than one format for each entry, but dumb enough clients may not be able to cope with more than one entry per title. Thus the config file has to include logic to always list the auto-transcoded version of each title first, so that less capable clients will see it and be able to play it back.

Exactly how you set up your transcoding settings is of course up to you. If you store your media in a lossless format like FLAC, then you lose nothing by transcoding everything on-the-fly to all clients. On the other hand, transcoding from Vorbis to MP3 results in a quality degradation due to the double-lossy-encoding problem. The only real problem is deciding how to cope with multiple control point clients that have different codec capabilities. If you just use a single client, you have nothing to worry about. MediaTomb is also flexible enough to support just about any encoder as the transcoding helper application; for most media it uses the excellent VLC.

You have alternatives for LAN-based media sharing (NFS or Samba, DAAP, and so on), and although UPnP will let devices around the house access your files, it will not help you with other media tasks like synchronizing portable music players. The protocol's best feature is the ease with which you can make content available to "smart" devices over which you do not have direct control, like TVs and stereos. Yes, the lack of control that comes with consumer DLNA products is lamentable, but this is yet another way in which Linux and open source can help bridge the gap.

Linux How-Tos and Linux Tutorials : Intel’s Dirk Hohndel on 20 Years of Linux

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Intel's Dirk Hohndel presented a retrospective on the history of the Linux kernel Thursday at LinuxCon Europe in Prague. Although the 20-year anniversary of Linux has been addressed many times over the course of 2011, Hohndel took his own approach: a personal perspective from a developer who was present at the beginning of the process and has stayed involved, in one way or another, since.


Hohndel was one of the earliest kernel contributors, and said that he wanted to present his take on the history of the project to provide a perspective that was not focused on the growth of Linux adoption, because for the founders of the kernel, its primary appeal was as a technical challenge. World domination was an afterthought. In addition, he said, the core kernel team's continued focus on the "next technical hurdle" over the years is actually one of Linux's strengths. That is, they work on the kernel for its own sake. If it wasn't fun for them, it likely wouldn't be a platform for success for anyone else.

In the Beginning…

At the very beginning, Hohndel said, the idea of a new Unix-like kernel project was not novel. "Everybody" was writing their own OS, or so it seemed; with the sudden availability of affordable PC hardware and the GNU utilities, it was easy to get started on such a project for the first time. But Linux was different from the others, he added, because the other hobbyists wanted to retain control over "their" kernels: you were welcome to look, but don't touch, was the message.

Linus was the opposite: he asked for feedback, welcomed patches, and also chose the GPL license that made collaboration over the Internet with remote colleagues possible. In the early days, Linus himself was the number one support provider for Linux — anyone who had a problem could email him, and he would respond.

The timing of the start of the project was critical, but a few other events marked "inflection points" (not quite "turning points") for its future growth. The first of these was the rise of Linux distribution companies in the mid 1990s — they showed for the first time that people could actually make money from Linux and open source software. Primarily they were selling to the inside crowd at first, but it was an important milestone. On the other hand, it also had its downside, in that the distributions were in competition with each other, and their business offices provided the first pressure to differentiate their offering from the others, bringing the first cases of fragmentation to the community.

Hohndel added that he openly accepted responsibility for some of this, having spent time at SUSE. He only wanted to observe that the rise of distributions changed the way the community operated for the first — though definitely not the last — time. That era in Linux also saw the first people leaving the kernel community, as subsystem maintainers left to pursue other projects for the first time.

Avoiding Fragmentation

Linus did a good job of preventing fragmentation from actually splitting the community, he said, in particular preserving the idea that there would be only one kernel. The first embedded Linux systems and the first high-end server users were pulling the project in different directions, but Linus showed that maintaining a single kernel for all of them was not only possible, but better.

The next major inflection point for Linux was the arrival at the end of the 1990s of the first proprietary software that ran on top of Linux distributions. Investment from major IT industry players followed quickly, because for the first time non-Linux companies were beginning to rely on Linux to support their own business. An outgrowth of this era was that kernel development became a profession for developers who had previously done it just for the challenge. From that change came long-term vision for the future of the kernel.

The dot-com bubble (and subsequent bust) were both fortuitous events for Linux, Hohndel argued. The bubble saw start-up companies pour unprecedented amounts of capital into the infrastructure that supported Linux. A lot of venture capital was wasted in those days, he admitted, but the investment meant that businesses had to take Linux seriously. In the bust that followed, and in the 2008 financial crisis, businesses did turn to Linux, out of necessity. By that time, of course, it was more than ready.

Looking forward, Hohndel concluded, there has been talk (even at LinuxCon Europe) speculating on whether or not the "graying" of the core kernel team is good or bad. Hohndel (perhaps unsurprisingly) sees it as good. First, having the same team of subsystem maintainers for several years running has provided Linux with a level of stability over the long term that most commercial products cannot match. But second, it also indicates that the kernel team still enjoys what it does — meeting the next technical challenge, and doing it in an open, "by the geeks, for the geeks" manner.

Linux How-Tos and Linux Tutorials : The Kernel Panel at LinuxCon Europe

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Linux users got a rare opportunity to hear directly from the hackers at the core of the Linux kernel on Wednesday at LinuxCon Europe. Read on for more on the state of ARM in the Linux kernel, the need for new kernel contributors, and the death of the Big Kernel Lock (BKL).


Linus Torvalds and other kernel developers sat down for a question and answer session at the first LinuxCon Europe. Lennart Poettering, creator of PulseAudio and systemd, served as moderator for the panel, which consisted of Torvalds, Alan Cox, Thomas Gleixner, and Paul McKenney. The four took prepared questions from Poettering, as well as responding to impromptu audience member questions on every topic from version numbers to the future of the kernel project itself.

The panelists introduced themselves first. Torvalds, of course, is the leader of the project. Cox works primarily in system-on-chip these days (although he has had other roles in the past). Gleixner maintains the real-time patches. McKenney works on the read-copy-update (RCU) mechanism. Together they account for just a tiny fraction of the kernel community, but by their roles and experience offer keen insights into the health of the kernel, the health of the kernel community, and the directions it is heading.

Poettering opened by asking the group about a frequently-quoted comment by Torvalds that breaking userspace compatibility was something that had to be avoided at all costs. A nice sentiment, Poettering said, but one that sounds hypocritical considering how often the kernel team really does break compatibility with its releases — sometimes even with trivial changes like the switch from 2.6.x version numbers to 3.0 that happened earlier this year.

Torvalds pointed out that any program which assumed that the kernel's version number would always start with a "2" was broken already, but that the kernel team had also added a "compatibility" mode that would report its version number as 2.6.40 if buggy programs needed it. That encapsulated the kernel team's approach: add something new, but do everything possible to make sure that the old way of doing things continued to work.

Torvalds added that he used to keep ancient binaries around — including the very first shells written for Linux, which used APIs that were deprecated within months — to test against each new kernel release, just to make sure that old code continued to run. API stability is important, he said, but it flows out of not breaking the user experience. "No project is more important than the users of that project," he summarized.

Next, Poettering asked if there was an aging problem with the core kernel development team, noting that the average age of the subsystem maintainers was growing. Torvalds said no, but that it sometimes seemed like it simply because it takes time for a new contributor to "graduate" from maintaining a single driver to managing a set, and eventually to managing an entire subsystem. The others agreed; Cox added that there was plenty of "fresh blood" in the project, in fact more than enough, but there was a bigger problem in the gender gap, one that no one seemed able to fix despite years of trying. Most of the female kernel contributors today work for commercial vendors, he said, with very few participating out of their own hobbyist interest. Torvalds added that another reason it often seems like the kernel crowd is aging rapidly is how ridiculously young they were when they started: he was only 20 himself, and several other key contributors were still teenagers.

Audience members asked questions from microphones placed at the edge of the stage, and several had questions about specific features: the Big Kernel Lock (BKL), the complexity of the ARM tree, and whether or not embedded Linux developers were given as much attention as developers working on desktop and server platforms. Cox reminisced about the BKL, which he called the right solution for the early days of multi-processor support in Linux, even though it had subsequently taken years to replace with more sophisticated methods. It was always a nuisance, he said, but it got Linux SMP support much faster than other OSes, such as the BSDs.

The ARM architecture was controversial in recent months, after Torvalds had to resort to tough talk to get the ARM family to clean up its code and standardize more. The situation is much better today, Torvalds reported. The problem, he said, is that while ARM has a standardized instruction set for the processor, every ARM board has a different approach to other important things like timers and interrupts. Intel had never faced such a glut of incompatible standards for the x86 architecture because the PC platform was so uniform, so it has taken a while for ARM to see the need to take a more active approach towards standardization. Torvalds also said that for the most part the kernel team is very interested in embedded development; what gets tricky is that most embedded Linux devices are designed to be built once and never upgraded. That makes it harder to do testing and ongoing kernel support for embedded platforms.

When asked about the challenges facing the Linux kernel over the next few years, McKenney cited a number of research topics facing all operating systems: scalability, real-time, and validation, to name a few. Torvalds cited maintaining the right balance between complexity and the ability to get things done. The sheer amount of new hardware that comes out every year is overwhelming, he said; keeping up with it is a practical (though not theoretical) challenge for the team.

Speaking of the practical, one audience member asked Torvalds what his process was when getting a new pull request from a contributor. "I manage by trusting people," he replied. Whenever a pull request comes in, he looks at the person who sent it. Depending on the person, his process varies: some people have earned enough trust over the years that he believes in their judgment, while others have their own recurring issues that mandate additional review.

In any case, he said, he makes sure that the new code compiles on his own machine because he "hates it" when he can't compile for his own configuration. But for the most part, he said, his role is no longer to validate individual pieces of code, but to orchestrate the work of others. If he knows two people are working in a similar part of the kernel, he needs to be aware of it to avert clashes, but he trusts the maintainers of their individual subsections. That trust is given to the person, he reiterated; the individual earns it, not the company that the individual might work for.

The panel touched on other areas as well, including security, cgroups, and subsystem bloat. In each case, what comes across in a panel discussion such as this is how human the process of writing and maintaining the kernel is. The kernel team can make mistakes, and they have to route around them with bugfixes in subsequent releases. Maintainers may not be interested in a particular area of development, but they will look at and even integrate the patches because they are important to a subset of developers or kernel users.

The kernel may have steep technical challenges, but just as real a threat to productivity is burnout among maintainers. It is fun to watch the kernel team wisecrack and comment on stage, but it is also a healthy reminder that above all else, Linux is a collaborative project, not simply lines of code.

Linux How-Tos and Linux Tutorials : A Look at (Or a Listen To) Banshee 2.2

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

The Banshee project recently released version 2.2. Since it had been a while since I last explored Banshee on the desktop (although I have used the MeeGo flavor on a netbook), I decided to take a look, or a listen, to be more precise. On the surface, the app does not depart much from the iTunes-clone approach taken by essentially every other music app in the open source ecosystem. But Banshee is well-designed and has the potential to forge ahead in some interesting new directions.

You can download Banshee from the project's Web site, with stable builds available for openSUSE, Ubuntu, Fedora, Debian, Mandriva, and Foresight Linux, not to mention Mac OS X, Windows, and source code bundles. Linux users should have no trouble simply grabbing the latest copy (or adding the relevant package repository). Mac and Windows users will have a tougher experience, because the Mono app framework — one of Banshee's dependencies — is probably not installed by default. Have no fear, the instructions for every OS are specific and clear.

You may also want to look at the extensions provided for Banshee. They extend the application's functionality in a variety of ways. Some are what I consider fluff — visualizations, lyrics, plug-ins that search YouTube, and other tangential activities that do not really alter the core listening experience. They're fine if you like that sort of thing. But more important are the extensions that hook Banshee into other media sources or enable support for new formats. The main suite of extensions is packed into one bundle called the "Banshee Community Extensions," which is supplied as its own package by most modern Linux distributions.

When launched, Banshee uses a thin playback-control strip along the top of the window, and a sidebar on the left that gives you one-click access to the supported media sources. Whichever one you are using at the moment takes up the rest of the window. Naturally, most people are expected to spend the bulk of their time in the "Music" source underneath "Libraries" — this is your collection of audio files, which you can browse and sort ad infinitum. Banshee also includes your playlists under the Music heading, which I find very limiting. The "smart playlists" for recent additions and favorites are nice, but if you like compiling custom playlists you will rapidly run out of room in the sidebar, and you will find that using names longer than about twelve characters results in them being cut off with an ellipsis. Also found under the "Libraries" heading are podcasts, videos, and audiobooks (about which we'll say more later).

Beneath "Libraries" is the other main source category, "Online Media." This includes all of the web music stores, online storage services, and podcast or streaming audio channel directories that Banshee or its extensions hook into. It also includes a sub-category for "Shared Music" which includes network services like DAAP servers and UPnP shares. One of the nicest things about Banshee is that it automatically discovers and lists reachable media servers. I can't tell you how annoying it is to have to walk through a multi-step process to add a networked server in some other media players, particularly when they insist that you supply the details of the protocols used as step one. I actually care about media playing software, and I can barely tell you the differences between DAAP and UPnP — how can anyone expect a casual user to do the heavy lifting?

Unless you have never listened to music before, you already have your digital audio collection saved somewhere on disk. Banshee lets you "import" your collection by specifying its location in the filesystem, and it will subsequently watch the folders you choose for new or modified content. It also has a flexible CD audio ripping tool built in, as well as one of my favorite features: a built-in "metadata fixer." This utility lets you locate duplicate Artists, Albums, and Genres, and merge them by correcting the relevant labels embedded in the metadata tags.

Welcome to 2.2

Speaking of metadata woes, one of the new release's key features is duplicate song detection, which is an optional extension. It is not quite as simple as detecting duplicates in artist and album names, because it relies on matching multiple metadata entries, but it is certainly more useful. I can live with the occasional "Of"/"of" capitalization difference between various tracks, but actually having duplicate copies of a file is a waste of space. Also new among the 2.2 extension pack is AlbumArtWriter. This extension saves the automatically-fetched album cover images that Banshee looks up online into the album folders. The upshot is that you do not have to manually look up and save images separately, and they will be accessible to any other media program.
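To make "matching multiple metadata entries" a little more concrete, here is a minimal Python sketch of the general idea. It is not Banshee's implementation (Banshee is a Mono application, and the extension's actual matching rules are not covered here); the dictionary keys, the `find_duplicates` helper, and the two-second duration tolerance are all assumptions made for illustration.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so "Of" and "of" compare equal."""
    return " ".join(text.lower().split())


def find_duplicates(tracks, tolerance=2):
    """Return lists of file paths that look like copies of the same song.

    Tracks are candidates when artist, album, and title all match after
    normalization; they count as duplicates when their durations (in
    seconds) are also within `tolerance` of the neighboring candidate.
    """
    by_tags = defaultdict(list)
    for track in tracks:
        key = (
            normalize(track["artist"]),
            normalize(track["album"]),
            normalize(track["title"]),
        )
        by_tags[key].append(track)

    duplicates = []
    for candidates in by_tags.values():
        ordered = sorted(candidates, key=lambda t: t["seconds"])
        cluster = [ordered[0]]
        for track in ordered[1:]:
            if track["seconds"] - cluster[-1]["seconds"] <= tolerance:
                cluster.append(track)
            else:
                if len(cluster) > 1:
                    duplicates.append([t["path"] for t in cluster])
                cluster = [track]
        if len(cluster) > 1:
            duplicates.append([t["path"] for t in cluster])
    return duplicates
```

Requiring several fields to agree is what keeps a live version and a studio version of the same song from being flagged, at the cost of missing duplicates whose tags are badly wrong.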

On the playback front, 2.2 now supports SPC700 files (a format used for video game music), default equalizer settings, and XSPF playlists, which offer a way to pre-populate streaming media stations from Icecast or other servers. There is also a "mini-mode" display option, which is good news for those who listen to music all day long while trying to get work done. Banshee offers a lot of features, but in its default mode it also likes taking up a bulky percentage of your screen space.

The connectivity features of the new release may be of more practical value (except for SPC700 collectors, obviously). This version adds synchronization support for a long list of new hardware, including the Notion Ink Android tablet and the Barnes & Noble Nook Color. Most mobile smartphones are already supported, because at the very worst they can operate as USB Mass Storage devices when plugged in to the computer, but devices like tablets and e-readers are not nearly as standardized.

There is also new support for the eMusic service and store. Like other supported online music stores (such as Ubuntu's or Amazon's), you can browse the Web site to look for tracks and make purchases, and Banshee will download your new acquisitions. But it can also re-download all of your old purchases using the stores' implementation of "digital lockers," which is a boon for people with multiple machines or who have suffered hard drive failure or loss.

In addition to the new primary features, there are a lot of minor improvements, including several fixes to the way notification messages, application indicators, and sound menu controls work in GNOME 3 and Ubuntu. It is a relief that Banshee supports both desktop environments, since it looks like both are here for the foreseeable future.

On the development front, there are lots of intriguing new projects on the horizon. UPnP support, which at the moment is implemented as an extension, may become part of the default code base. Even better is Banshee's migration to an all-GStreamer playback engine under the hood.

Not All Cookies and Rainbows

There is a lot to like about Banshee 2.2, but it still has several important weaknesses you need to be aware of going in.

First and foremost, and it pains me to say this, I think that audiobook support has actually gotten worse with time. Banshee is one of the only audio players that attempts to let you manage your audiobooks separately from your music, and that is a good thing. You never want to turn on shuffle mode and catch a randomly-selected chapter from the middle of a book in between songs. But Banshee's audiobook module is just plain broken.

The interface is there: a separate link under "Libraries" and separate preferences for storing your collection, but it doesn't work. The only way to import audiobook files is to import them into the music library first, then manually drag each file to the Audiobooks category, after which they may or may not still live on in your music library. Worse, the Audiobook browser is nonfunctional, bookmarks do not work at all, individual chapters are not recognized as coming from the same title, and so on and so on. In previous versions, you could at least set the audiobook preferences to watch a separate folder for new material; in 2.2 you can apparently still set the preference, but it has zero effect.

The thing that makes all of this breakage worthy of criticism is the fact that audiobook support is not an add-on provided by the community: it is a shipping component in the default builds. Audiobook support is even mentioned as one of the top bullet points on the Features page. Banshee does such a good job organizing music and podcasts that it is an embarrassment for the project to keep advertising audiobook support it knows is broken. Fix it or turn it off; leaving it as is is inexcusable.

On a related note, the project makes a big deal out of the extension mechanism, but it does not do much (if anything) to maintain quality control. Sure, any particular extension for any application can be good or bad; what I mean is that there is no compatibility checking, no review, and no update mechanism. That makes every extension a crap shoot. There are some extensions that I really like in theory (such as the Internet Archive extension that hooks into old radio series), but periodically something changes in Banshee and they no longer work.

Without a repository or a feedback mechanism, there is really nothing you can do. The Internet Archive extension, for example, no longer has a working UI: the columns are fixed-width and unreadably narrow, and it picks up the wrong theme colors from GTK, leaving some of us with white-on-light-gray text. Banshee is right to aim for building an extension ecosystem — it just needs to take some cues from Mozilla about the proper way to manage one.

All in all, I like Banshee's approach to audio management: it does a much better job of treating disparate types of content in ways that are appropriate to the content type. In contrast, too many audio players lump local music files, podcasts, and streaming media into one amorphous library, or try to shoehorn adding a DAAP music share into the same workflow used for adding a new album folder.

Banshee also does a good job at taking care of automatable tasks for you (the album art writer and automatic duplicate-detectors, for example). I hold out hope that future releases will fix the outstanding bugs with audiobooks and extensions — then we can talk about other important usability issues, like the program's penchant for calling streaming media services "radio." But just about every audio player does that….

Linux How-Tos and Linux Tutorials : Project Gutenberg and You: Using Open Source to Contribute to PG

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Michael S. Hart passed away in early September — you might not know his name, but you certainly know his work. Hart founded Project Gutenberg, the oldest and arguably the best free e-book library in the world, home to tens of thousands of titles in dozens of languages, all contributed by volunteers. Project Gutenberg is no doubt going to continue to thrive, but as a tribute to Hart, let's take a look at the free software tools you can use to participate, from scanning in an old book to helping the Project's ongoing proofreading and formatting work — and sharing PG books with others.

Just as a refresher, Project Gutenberg's (PG) ebook library consists of works that are out in the public domain in the United States (where PG is based). With a few peculiar exceptions, that public domain status means they have been previously published long ago, and PG requires that new ebook entries be "cleared" — usually by verifying the date on an actual old copy of the book in question.

The practical upshot is that PG titles all originate as scans of old, printed volumes, which then get run through optical character recognition (OCR) to convert the images to text, and human proofreading, formatting, markup, and on some titles, translation. As you can imagine, that takes a lot of software.

OCR: Teaching the Computer to Read

OCR is always a tricky proposition: it requires computer vision, natural language, and a host of other disciplines working in tandem to pick out letters from an image file with any degree of reliability. There are several high-quality open OCR engines, most of which are designed to function as either libraries or CLI tools. In practice, you will want to scan images from a book using a GUI tool (to correct exposure, alignment, and everything else visually-identifiable) that can call on one of the CLI OCR applications to perform the text conversion.

The current best-of-breed OCR engine is Tesseract, which was created at HP, released as open source in 2005, and has been developed under Google's sponsorship since 2006. Recent builds have added more scripts and languages to the corpus of what Tesseract can recognize, and better support for more image file formats. Other engines you may want to install include CuneiForm, Ocrad, and GOCR. It is probably a good idea to install all of the engines provided by your distro's package management system; they may vary in performance from text to text (or font to font). If you have them all installed you can process a couple of test pages before proceeding to batch-convert large sections of a book.
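If you would rather script that trial run than click through it, the following Python sketch shows one way to do it. Treat it as a starting point under stated assumptions: the command templates reflect the basic `tesseract IMAGE OUTPUTBASE` and `gocr -i IMAGE -o FILE` invocations, but options and accepted image formats differ between versions and engines, so check each tool's man page first. The `ENGINES` table, the helper names, and the `trial-` output prefix are my own inventions, and other engines can be added to the table in the same way.

```python
import shutil
import subprocess

# Command templates; {image} and {out} are filled in per page.
# Assumption: these match the versions your distro packages.
ENGINES = {
    "tesseract": ["tesseract", "{image}", "{out}"],  # writes {out}.txt itself
    "gocr": ["gocr", "-i", "{image}", "-o", "{out}.txt"],
}


def build_command(engine, image, out_base):
    """Fill in the template for one engine and one scanned page."""
    return [part.format(image=image, out=out_base) for part in ENGINES[engine]]


def installed_engines():
    """List the engines from ENGINES that are actually on the PATH."""
    return [name for name in ENGINES if shutil.which(name)]


def run_trial(image, out_prefix="trial"):
    """Run every installed engine on one test page.

    Returns a dict mapping engine name to the text file it produced,
    so the results can be compared side by side.
    """
    results = {}
    for engine in installed_engines():
        out_base = "{}-{}".format(out_prefix, engine)
        subprocess.run(build_command(engine, image, out_base), check=True)
        results[engine] = out_base + ".txt"
    return results
```

Calling `run_trial("page-042.png")` (a hypothetical file name) leaves one text file per engine in the current directory; a quick `diff` between them shows which engine copes best with the book's typeface.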

As for the GUI front-end and scanning software, OCRFeeder is the place to start. A GTK+-based application, it can scan directly from a scanner, or import images you scanned in another app (such as Skanlite, XSane, or Simple Scan). OCRFeeder can use any of the OCR back ends mentioned above, and can even do document-layout-recognition, which is necessary for multi-column texts and books with illustrations.

Once The Text Is In, Making Sense Of It

Even if your OCR output is 99% accurate, a human being must still proofread the electronic copy of the text to find where that troublesome one percent is hiding, and fix it. PG has two main approaches: the go-it-alone method and the curated Distributed Proofreaders (DP) project. DP is definitely the more rigorous, and accounts for a large percentage of PG's new ebooks, but for the sake of comparison, let's consider both, since the proofreading and markup steps could prove important to any ebook project you work on, PG-bound or private.

If you take the go-it-alone route, you will need to proof your text for spelling and spacing problems (at least those of the kind spawned by OCR; you should not attempt to modernize the spelling of a very old book). Pulling the text into a word processor (or text editor) with spell-check for the language in question is a good idea, but PG fans have also developed an OCR-specific checking program called Gutcheck that may be a superior choice for your first pass. OCR tends to make different mistakes than human typists, so word processors' spell-checkers often do not catch them.
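To illustrate why OCR errors call for their own checks, here is a small Python sketch of the kind of test an OCR-aware checker can apply. It is not Gutcheck and does not reproduce Gutcheck's rules; the tiny `SCANNOS` table and the two regular expressions are assumptions chosen to show the idea. Note that "arid" and "modem" are perfectly good dictionary words, which is exactly why an ordinary spell-checker lets them through when they replace "and" and "modern".

```python
import re

# Misreadings that OCR produces but a typist rarely would (illustrative only).
SCANNOS = {"tbe": "the", "tlie": "the", "arid": "and", "modem": "modern"}

# A run of letters and digits containing at least one of each, e.g. "w0rst".
MIXED = re.compile(r"\b(?=[A-Za-z0-9]*[A-Za-z])(?=[A-Za-z0-9]*\d)[A-Za-z0-9]+\b")

# Whitespace directly before punctuation, a common spacing artifact.
SPACE_BEFORE_PUNCT = re.compile(r"\s[,.;:!?]")


def check_line(line):
    """Return a list of warning strings for one line of OCR output."""
    warnings = []
    for word in re.findall(r"[A-Za-z]+", line):
        guess = SCANNOS.get(word.lower())
        if guess:
            warnings.append('possible scanno "{}" ({}?)'.format(word, guess))
    for match in MIXED.finditer(line):
        warnings.append('letters and digits mixed in "{}"'.format(match.group()))
    if SPACE_BEFORE_PUNCT.search(line):
        warnings.append("space before punctuation")
    return warnings


def check_text(text):
    """Return (line_number, warning) pairs for a whole text."""
    report = []
    for number, line in enumerate(text.splitlines(), start=1):
        for warning in check_line(line):
            report.append((number, warning))
    return report
```

Every hit is only a suspicion: "1st" will trip the mixed-character test, and "arid" is sometimes the right word, so the output is a list of places for a human to look rather than a list of corrections.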

One of PG's principles is that books can best be preserved by using the simplest, most compatible file formats available, so all titles are made available as plain ASCII text (or in a suitably simple encoding that captures the correct accent marks if ASCII does not). But most readers prefer to use HTML. Many text editors can output simply-formatted text as HTML with zero effort, but if the peculiar formatting of your book makes that a problem, you can use an ebook editor like Sigil to clean up the output.

PG also recommends that all HTML be passed through the official W3C HTML Validator to find errors. You might also want to use GutenMark (a CLI app) or its graphical front-end GUItenMark to convert your ebook from plain text to nicely-formatted HTML.
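For a sense of what the simplest possible text-to-HTML pass involves, here is a minimal Python sketch. It is not GutenMark's algorithm, which does far more to recognize headings, italics, and verse; the `text_to_html` function is hypothetical, it only treats blank lines as paragraph breaks, and the HTML5-style skeleton it emits is my assumption rather than a PG requirement.

```python
import html
import re


def text_to_html(text, title):
    """Wrap blank-line-separated paragraphs of plain text in <p> tags."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    body = "\n".join(
        # Re-join hard-wrapped lines and escape &, <, and > for HTML.
        "<p>{}</p>".format(html.escape(" ".join(p.split())))
        for p in paragraphs
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n<head>\n<meta charset="utf-8">\n'
        "<title>{}</title>\n</head>\n<body>\n{}\n</body>\n</html>\n"
    ).format(html.escape(title), body)
```

Anything produced this way should still go through the validator, and a book with poetry, tables, or footnotes will need hand markup or a real conversion tool.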

These steps give you a rough idea of what it takes to polish a text for inclusion in the PG library, but a far better approach than doing it all yourself is to participate in the Distributed Proofreaders process, which leverages a large community of dedicated volunteers and has all of the kinks in the process smoothed out. DP breaks the proofreading and correction process into discrete "rounds," and offers a slick web-based tool for volunteers to do proofing one page at a time.

A volunteer manager oversees each book project, ensuring consistency. DP has proofing and formatting guides for the volunteers. The DP site has detailed instructions on how to get started; you can get a good feel for the workflow by reading the FAQ and even start flagging mistakes as an unregistered "smooth reader."

But Where Do the Books Come From?

All technical issues aside, getting a new book approved for inclusion in the PG library or DP project is an important part of the process that cannot be hurried through. It is important to both projects that quality controls be followed to verify the copyright status of book projects, and to make sure that two people do not start identical projects to digitize the same title.

Both projects provide guidelines to help. You can easily search for a book title and make sure that it is not already in the library, but if it is not, you should still contact the project to begin the copyright (and de-duplication) clearance process before you begin. PG explains its rules and provides contact information on its Copyright How-to page, and links to an online tool to help you verify a book's copyright status. DP also has a good landing page outlining how you should proceed to propose a new book project; it is called the Content Provider's FAQ.

David's In-Progress List keeps track of the ongoing book projects, which makes it easy to check your proposal against works that are not-quite-yet in the library. DP also maintains a list of partially-finished books that are missing pages — if you have a copy and can provide a scan or text of the missing page, you can help out a great deal. Finally, DP runs a web discussion forum about content sourcing, which is a great place to get the most up-to-date information and pick up tips on how to proceed.

But Wait There's More!

Project Gutenberg has amassed an astonishing collection of literature, a lot of which is increasingly hard-to-find in print. But its influence goes further than that, with ripples that have created other projects also promoting literacy and open access to content. You might find one of them worth volunteering at as well.

PG has an official effort to burn and distribute periodic ebook collections on CD and DVD media, to help those who do not have constant Internet access. There is a separate PG effort underway to digitize sheet music — mostly classical chamber music, but a variety of styles and composers. There is a lot of overlap with PG ebook methods, but sheet music has its own special challenges, so volunteers with musical expertise are in great demand.

A close cousin to PG's ebook library is its audio books effort. There are several sources for audio recordings, most notably the Librivox project, where human volunteers record works in their own voice. For a while PG was also adding computer-generated recordings, but due to a variety of technical problems, they are rarely as good as a human voice. The collection of PG audio books is far smaller than the electronic text library, so help is appreciated.

Finally, it is important to remember that PG is based in the United States, and focuses its efforts on works that are public domain in that jurisdiction. Other countries have different copyright laws, making the copyright status of recent internationally-published works difficult to verify. There are now PG affiliated projects in several other countries, including Canada, Australia, Germany, and Norway. The national affiliate projects often focus on works in the native language(s) of the region, in addition to adhering to the appropriate copyright terms.

In a wider sense, Project Gutenberg (which was founded in 1971) is something of a precursor to many of the open-data-projects that are popular today — Wikipedia, the Internet Archive, OpenStreetMap, etc. They all take the crowd-sourced, volunteer-driven model that Hart made popular, and use it to provide free access to information, for everyone. That's a pretty good legacy.

Linux How-Tos and Linux Tutorials : Weekend Project: Get Grammar Checking for Your Open Source Office Suite

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Sepllchecking. I mean, spellchecking: it’s such an integrated part of our word processing and email workflow these days that we feel ripped off when an application (or phone…) doesn’t have it built-in. Sadly, grammar-checking is a little bit behind. Checking words against a dictionary is trivial, but picking out parts of speech and sentence structure is trickier. Some proprietary office suites include grammar tools built-in, although the free software suites do not. But there are plug-ins available that bring grammar and stylistic help to all of the major open source word processors.

Naturally, grammar-checking is tightly bound to the language of the document. Thus each grammatical tool must add explicit support for each language it supports. A few projects incorporate more than one language, but for non-English writing, it is advisable to look for a language-specific plug-in for your word processor. It is likely to be of higher quality and have a wider rule-set if it is maintained by native speakers, as opposed to a research project or one-size-fits-all generic grammar framework.

Another, somewhat related caveat with grammar-checking is that most of the time, grammatical rules cover only generic conversational language. If you are writing a technical document (particularly one about programming with its host of reserved words), you are liable to get lots of “false positives”: points where the grammar checker thinks you have made a mistake, but in reality you were using a technical term that looks wrong if you don’t know better.

A separate bit of fall-out from this language dependency is that I, as a moderately-fluent English speaker, do not find it easy to assess the quality of grammatical tools designed for other languages. I have found a few, but if you have others or have a strong opinion, please feel free to share them in the comments. Please also share any tools or scripts you find for Calligra, the KDE office suite — LibreOffice, OpenOffice, and AbiWord have plenty of options. But I did not have any luck locating options for Calligra.

Grammar Hammers

LibreOffice, OpenOffice, and AbiWord all have extension mechanisms, so even though there is not a built-in grammar checker, you can find plug-ins to add in the functionality. As of right now, the Big Two in this arena are LanguageTool and After The Deadline. They offer grammar engines suitable for use in the major office suites but also in other environments.

LanguageTool can run as a standalone Java application, or be installed as an extension for LibreOffice, OpenOffice.org, or (with a little sleight-of-hand) other applications. Each language pack is maintained separately as a collection of rules and exceptions. The current version has robust support for English, German, French, Polish, Dutch, Romanian, and Russian, and supplementary support for about 19 other languages. You can even browse the language rules data online; if you want to contribute to growing support for your language, that is a good place to start. There is also a new rule-conversion tool added as part of 2011's Summer Of Code to help grow the language support.

After The Deadline uses a client-server model. The makers of ATD run the primary server, to which the various extensions and plug-ins connect. But you can also download the source code and run your own server if you so desire. Extensions are provided for LibreOffice, OpenOffice.org, and various other applications (such as Firefox and Chrome). App developers can even incorporate it directly into other applications through the API. ATD only supports English at the moment, but there is work underway to extend it. It sports some features that LanguageTool does not, including differentiating between grammar and “style” problems, and an integrated spell-checker.

Between the two, LanguageTool offers broader language support, but ATD offers hooks into more applications (you can even integrate it into your WordPress blog). A long time ago, ATD started out as a fork of LanguageTool, and although they have different areas of emphasis now, there is no reason you cannot install both and adapt to whichever one gives you the best results.

Elsewhere in Morphology-ology

AbiWord has its own grammar engine project, although depending on your platform and distribution, it may not be installed by default. If not, you can download it — or grab recent updates — from the AbiWord site. It is based on open source work done at Carnegie Mellon University and at Open Cognition. The focus of the plug-in is on English, but there is support for German, French, and Lithuanian as well, plus special work to enable support for scientific and medical text.

Beyond these three, most of the extensions available are single-language tools. LibreOffice has separate extensions available for Russian, Portuguese (Brazilian), French, and Irish. OpenOffice users can find extensions for Portuguese, French, Esperanto, Russian, and Hungarian.

Grammar checking is a subject of ongoing academic research, and there is no one ideal method. So if you don’t find the perfect grammar tool for your word processor of choice and language, it is a good idea to keep checking back periodically. You’ll find other tools out there in various states of readiness; the LanguageTool site has a good list, as does the LinguComponent section on the OpenOffice site.

Who knows? You may even get motivated to reinvigorate an existing project, like Graviax, or to help port and debug one of the older OpenOffice extensions over to LibreOffice. Or maybe you can just contribute back by strengthening the support in one of the existing projects for the languages you know.

Linux How-Tos and Linux Tutorials : Podcasting from WordPress

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

WordPress is the leading open source blogging platform for good reasons, but what do you do when the written word no longer suffices? Sure, you can attach an audio or video file to a WordPress post, but if you are interested in managing a professional-caliber podcast from your server, you need more. Let's compare the alternatives.

For starters, it is important to set our expectations. There are full-blown podcasting services that will host your program and do all of the lifting for you — but we are not interested in that. We are interested in self-hosting the podcast on a machine that we control, either our own LAMP stack or a similar virtual server from a typical Web hosting provider. Otherwise, where's the fun?

Still, in addition to producing a valid podcast feed that audio and video clients can use on any platform, we want the same level of configurability that we've grown used to with WordPress's blog feeds, such as the ability to make our podcast feed available separately from our text feeds, stats that are actually helpful, and for open source friendliness, support for free codecs in addition to MP3 and the video flavor-of-the-month. It would also be nice to make sure our work gets picked up and recognized by the popular podcast directories.

WordPress and Its Built-In Features

At a technical level, a podcast feed is just a normal RSS or Atom feed that includes a link to a media file inside an <enclosure> tag. When you compose a post, you can attach a media file (audio or video) from within the post editor, and WordPress will automatically create the enclosure. But all an RSS subscriber or visitor to your site will see is a generic hyperlink in the body of your post, not an in-browser player that lets them enjoy your insights and witticisms.

And that post will be mingled in with the rest of your content — not a bad option if you are using your WordPress installation exclusively for your show, but awkward if you want to do both, and ultimately unfriendly towards listeners who want to subscribe with their audio player.
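If you have never looked inside a podcast feed, the following Python sketch builds a single RSS `<item>` with an enclosure so you can see what a podcast client actually reads. WordPress generates this markup for you; the `podcast_item` helper, the episode title, and the URLs under yourblog.com are placeholders of my own. The three attributes on the enclosure (`url`, `length` in bytes, and the MIME `type`) are the ones the RSS 2.0 format calls for.

```python
import xml.etree.ElementTree as ET


def podcast_item(title, link, media_url, length_bytes, mime_type):
    """Build one RSS <item> element carrying a media enclosure."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # The enclosure is what turns an ordinary post into a podcast episode.
    ET.SubElement(
        item,
        "enclosure",
        url=media_url,
        length=str(length_bytes),
        type=mime_type,
    )
    return item


if __name__ == "__main__":
    episode = podcast_item(
        "Episode 1",
        "http://yourblog.com/2011/10/episode-1/",
        "http://yourblog.com/media/episode-1.ogg",
        12345678,
        "audio/ogg",
    )
    print(ET.tostring(episode, encoding="unicode"))
```

The `type` attribute is also where codec choice shows up: a client that cannot play `audio/ogg` decides to skip the episode based on that value alone.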

Still, there are a few things you can do to whittle generic WordPress audio/video enclosures into a suitable podcast feed. For starters, you can create a separate category for your podcast content (in the Dashboard, go to Posts -> Categories). Using podcast as a category name, the category-specific feed automatically produced by WordPress will look like http://yourblog.com/category/podcast/feed. You can use multiple categories and effectively provide multiple podcast feeds this way, using a single WordPress blog.
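The URL pattern is regular enough that you can generate the feed links for several shows at once, for example when filling in sidebar links. This short Python sketch assumes your site uses a permalink structure that produces the /category/NAME/feed pattern shown above; the `category_feed_url` helper and the category slugs are hypothetical.

```python
from urllib.parse import quote


def category_feed_url(site, category_slug):
    """Return the feed URL for one category, following the pattern above."""
    return "{}/category/{}/feed".format(site.rstrip("/"), quote(category_slug))


if __name__ == "__main__":
    # One WordPress blog, several shows: each category gets its own feed.
    for slug in ["podcast", "video-podcast"]:
        print(category_feed_url("http://yourblog.com/", slug))
```

It is worth loading each generated URL in a browser once, since a site with different permalink settings will use a different address for the same feed.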

Category-specific feeds may be automatically created, but for your audience to see them, you will also need to make them visible in your theme and layout. You can manually add a link to the correct URL in the sidebar or masthead, or use a plugin like Subscribe Sidebar.

To provide a simple audio player for Web visitors (who might want to sample an episode before subscribing in their dedicated podcast client), you can add another plugin like Audio Link Player or WPaudio. Be forewarned, however, that although there are lots of embedded audio player options, many of them are Flash-based, and few offer support for Vorbis or other media types, so if those qualities matter for your feed, double-check.

Providing a video player for video-podcasts is more difficult, but there are several options to choose from as well. The Hana FLV player is a Flash-based option that is actively maintained. Alternatives that use HTML5 elements are out there, too, such as VideoJS.

Regardless of whether you publish audio or video, you also need to worry about providing helpful metadata (such as an episode summary for each post, and a show description and/or "title card" image) for users that find your show through iTunes, Miro, or some other podcast directory service. There are add-ons to handle this as well, such as Podcast Channels.

Dedicated Podcast Management

But then again, if you take that route, you end up having to manage three to five separate plugins. They may not all receive updates at the same time, and keeping them configured just costs you time. A simpler option is a dedicated podcast-management plugin, of which there are several to choose from.

One of the oldest options is a plugin called simply Podcasting. It includes metadata management, support for producing multiple, independent feeds, and a Flash-based audio/video player. It also allows you to host your underlying media files on a separate server from your blog itself. Some people prefer to do this just to save bandwidth and drive wear (which is a bigger concern if you run your own server), using Dropbox, Archive.org, or a range of other Web storage services. If you want torrent content delivery, however, you can only do that with a separate plugin.

Podcasting adds tools to the WordPress post editor that allow you to upload or link to a media file, and add episode-specific information. There is a separate configuration page in the Dashboard to manage show-wide metadata and publication options. You can also customize several attributes of the player, and can set up separate feeds for separate codecs.

The podPress plugin is a newer alternative that handles the same basics, but does noticeably more than Podcasting. For example, support for non-MP3 codecs like Vorbis is built-in, whereas Podcasting requires you to set them up manually. You can also automatically include a "preview" image for video posts, rather than loading the video player. Flash and HTML5 players are built-in.

From a management perspective, podPress allows you to see and adjust the ID3 tags embedded inside your media enclosures, which can be important to maintain compatibility with third-party directory services (the iTunes directory in particular). PodPress can also produce dedicated itunes: subscription URIs to launch the iTunes client from the browser — on those OSes where iTunes is available.

PodPress also allows you to maintain a separate "paid subscription" feed that you can use to monetize your hard work. There is also a full-featured statistics package that lets you keep tabs on your download and subscription popularity.

The third and final major podcast-management option is Blubrry Powerpress. Like the name suggests, this plugin is designed to interact with the third-party podcasting Web service Blubrry, but you do not need to buy a Blubrry account to use it.

Like podPress, it includes show and episode-specific metadata management from the Dashboard, a choice between HTML5 and Flash-based Web players, and support for multiple podcasts within the same back-end. Powerpress even includes multiple media player options covering a wide range of styles and overhead. Powerpress's media support is excellent, even including the royalty-free WebM video codec. You can also choose to store your media content on an external site, such as YouTube, Blip, or Ustream.

Blubrry hosts the statistics package used by Powerpress, and offers a paid "premium stats" service, although there is also a free basic service available. Blubrry can also host your media files (for a price) and provide a directory service. There are several other interesting features offered by the plugin itself, such as diagnostics and the ability to have your Web player embedded on other pages (to help with sharing your content). Perhaps most interesting is that Powerpress supports offering podcasting as a user-role option for multiuser blog setups, and offers full support for WordPress MU.

Roll Credits

Whichever route you take, you will find using WordPress to create and publish your podcast to be the easiest step in the process. Far more difficult is the task of spreading the word and attracting listeners or viewers. There are plenty of plugins and tools to help you push out new notifications over social media networks, but by and large marketing your work is a hard slog. You have to make your show visible in the popular directories, from the corporate-controlled iTunes to the community-driven Miro Guide. At least you won't have to expend extra mental cycles on the technical bits.

Linux How-Tos and Linux Tutorials : GNU Emacs 101 for Beginners

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

No matter how slick Linux on the desktop gets, there will probably always be a gulf between new users and veterans for whom the Linux environment has become second nature. Root accounts, the filesystem hierarchy, and the dizzying array of distributions all seem strange at first, but eventually new users learn that they are nothing to be feared. So too with the "programmer's editors" Emacs and Vi/Vim: the first encounter can be intimidating because of the unfamiliarity of either editor. As an Emacs user, I'm interested in providing some advocacy, so if you've been wondering just what the point of an editor like Emacs is, or if you're not sure how to get started, read on for a gentle introduction.

Why Emacs?

Naturally, you can choose any of several dozen applications to edit your files with, from lightweight utilities like Nano to full GUI systems with plug-ins. So why do people like Emacs? Two reasons.

First, at the core of Emacs' appeal is the fact that it is easily customized. Technically speaking, Emacs itself is an interpreter for the Lisp programming language, and it simply starts up running a text-editing environment. But you can change everything about it, adding and removing functions, modifying features, and tying in to other programs. Although most people do not study Lisp, it is very easy to learn, and there are plenty of simple modifications you can make to bend Emacs to your will in mere minutes. That's not true of most editors.

That said, you don't have to change anything about it in order to get started. Which leads to reason number two: the developer community around Emacs is huge, and as a result there is a vast ecosystem of tweaks and extensions that you can find and install without any effort. Sure, you could write your own re-programmable editor from scratch, but building up a vibrant community of users and contributors would take forever.

In short, you might think of Emacs as the Firefox of editors. There are other options, and lots of them are good. Some might even offer more speed or a smaller memory footprint. But the advantage Emacs offers is its customizability; with add-ons you can tailor the program to fit exactly how you want to work.

The Basics

A good chunk of Emacs' intimidation factor stems from the conventional wisdom that the interface is totally unlike other Linux applications, so you have to learn a lot just to get started. It is true that Emacs has been around longer than all GUI desktop environments, and as a result has legacy keybindings (and names for things) which differ from the standards we've picked up from the Windows and Mac worlds.

But you don't need to care about them, because on today's desktop systems, you can run Emacs as a full-blown GTK+ GUI editor, complete with toolbar, menu bars, scroll-wheel support, and all of the other conveniences. Thus, my recommendation on all the Emacs cheat-sheets and references is simple: don't bother. If you just want to get started, fire up Emacs and use the menus — you can look into keybindings later.

If that worries you … okay, just memorize this one command, and you'll have no trouble: Ctrl-]. That's holding down the Control key and hitting the right-bracket key. What this command does is stop whatever command sequence or function you're currently in, and return you to plain-ole editing. No matter what trouble you think you've gotten into (such as a mistyped keystroke that starts something unfamiliar), Ctrl-] will end it and set you back down on your feet, pronto.

Emacs is a standard application these days, so you can install it from your distribution's package manager. You may even find several versions available, because people like to fork it and maintain specialized builds. If in doubt, choose the vanilla Emacs package. The most recent release is 23.3, from March 2011.

When you fire up Emacs for the first time, you will see some startup messages flash across the new window, but in a moment it will settle down and be ready to use. Across the top are the menus and toolbar, down the side is the scrollbar, and at the bottom are two special lines akin to a "status area." The next-to-bottom line is mostly dashes, and is there to separate the editing portion of the screen from the line below. But this line will also tell you the line number the cursor is on, the name of the currently-open file, and highlight a pair of asterisks if the file has been modified. The very bottom line is called the "minibuffer" in Emacs-speak. It shows messages and alerts, and whenever you type a control key-sequence, the keys you press appear here so you can see that you typed correctly.

By default, whenever you launch Emacs it opens a temporary file called "scratch" in the main portion of the window. Emacs refers to the files it has open as "buffers" (which reflects the fact that they don't get written to file on disk until you say so). The scratch buffer has a short paragraph at the top explaining that nothing you type in scratch gets saved. That paragraph starts with semicolons, because semicolons are the comment-delimiter in Lisp. See, you can actually type Lisp code into scratch and have it executed. But you don't have to.

At this point, you can open a file from the file menu, edit it, cut and paste, and save it all from the menus. As you do so, you'll pick up on some Emacs-isms. For example, Emacs rarely if ever uses pop-up dialogs. If it needs to ask you something (such as what word you want to search for in a text search), it does so in the minibuffer — or, worst-case-scenario, it will expand the minibuffer up a few lines, for instance to let you choose a file from a directory listing. This is because Emacs predates modern Unix windowing systems, and to this day its authors make sure it still functions correctly in a windowless environment.

You can also see Emacs' keybindings next to their respective commands in the menus. This is a good way to get used to the keybindings. Another Emacs-ism is the fact that way more commands have keybindings, because Emacs lets you bind commands to two-key sequences. For example, Ctrl-X Ctrl-F opens a new file. That seems unusual at first, until you notice that all of the file-management keybindings reside in the same "block": Ctrl-X Ctrl-S saves a file. Besides Ctrl, the other important key is Meta. It's an anachronistic name: on almost all keyboards you use Alt for this key, but there are still some exceptions, so the name sticks. Ctrl and Meta are displayed in the minibuffer and menus as C- and M- respectively.

Modes

Now let's talk about that (Lisp Interaction) label at the bottom of the screen in the scratch buffer. It tells you that you are working in Lisp mode. If you open or create a plain text file, the mode will change to (Text).

Emacs modes are a big part of why programmers like the editor. There are modes for every conceivable programming language, and they assist you with writing code in several ways. They can highlight syntax, auto-indent, complete commands, match parentheses, and allow you to jump through large portions of your code at once (such as jumping directly to a function definition, if you have run across a function call and need to refresh your memory).

In addition to programming languages, there are modes to simplify work with particular file formats (such as TeX) and utilities (such as nroff), and "minor modes" that change smaller aspects of their parent mode. An example of this would be Text-Based Table mode, which lets you navigate around a WYSIWYG-style table in plain text, far faster than manually lining up your columns and rows.
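
As a rough illustration of how major and minor modes fit together, here is a minimal sketch of two lines you could place in your Emacs startup file. The ".notes" file extension is a made-up example, and pairing Text mode with Auto Fill is only one possible combination, not something the editor requires.

```elisp
;; Hypothetical ".notes" extension, chosen only for illustration:
;; open files ending in .notes in Text mode (a major mode).
(add-to-list 'auto-mode-alist '("\\.notes\\'" . text-mode))

;; Whenever Text mode starts, also switch on Auto Fill, a minor mode
;; that wraps long lines automatically as you type.
(add-hook 'text-mode-hook 'turn-on-auto-fill)
```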

There are also Emacs editing modes that have nothing to do with programming, such as Picture mode for "ASCII art" and outline mode, which lets you collapse and expand nested outlines. In both of these cases, the special mode simply changes the way the cursor moves around the document, rather than changing the contents of the file in unusual ways. But these add-ons are not just cosmetic; you can also do serious work with them, such as encrypt and decrypt files, or compile and build code.

Even further out into left field are pure applications that are written as Emacs modes. There are modes for reading email, checking social networks, browsing the web, and playing games. In reality, these are Lisp programs, but by tying in to the Emacs framework, you can switch over to one of them and use fewer resources than you would firing up a separate program.

Your first power tricks, your first customizations

Modes are well and good, but you didn't come here to edit source code. To whet your appetite, let's look at a few of Emacs's interesting tricks.

Emacs has several powerful search-and-replace options, including full support for regular expressions. But perhaps the most intriguing to new users is incremental search. This search command lets you start typing your search term, and jumps immediately to the next match with every additional character you type. If you make it your default search method, you type fewer characters overall.

The recursive edit command, which you trigger with Ctrl-r from within an interactive command such as query-replace or the spell-checker, lets you pause whatever action you are currently doing and resume editing (or reading) a file, or even execute another command. You can resume the suspended action with Ctrl-Meta-c. This can be especially helpful when you are in the middle of a long process, such as spell-checking a document.

When it comes to customizing the editor, a good place to start is with a ColorTheme. I use a dark GTK+ theme on my desktop, and don't care for the high-contrast black-on-white text in Emacs' default look. So I use a custom color scheme instead. The ColorTheme package is an add-on written in Lisp, which you can install through your package manager.

To activate it, you will need to follow the instructions on the ColorTheme project page, which tell you how to save the right commands into your personal customizations file, which is stored at ~/.emacs. This file contains any Lisp commands you want Emacs to execute whenever you start it up, and it can load settings, launch add-ons, or even define new functions that you write yourself.
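
For orientation, the lines you end up adding to ~/.emacs typically look something like the sketch below. It assumes the color-theme package is already installed somewhere Emacs can find it, and the theme named here is just one example of a bundled dark theme; treat the project page as the authoritative source for your version.

```elisp
;; Assumes the color-theme package is installed where Emacs can find it.
(require 'color-theme)

;; Load the theme definitions that ship with the package.
(color-theme-initialize)

;; Apply one bundled dark theme; color-theme-hober is only an example choice.
(color-theme-hober)
```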

Once you get comfortable editing .emacs, the odds are you will periodically find a new command you want to save for everyday use. For example, I have (fset 'yes-or-no-p 'y-or-n-p) in my .emacs file, a setting that turns off Emacs' demand that you type out the full words "yes" or "no" when it asks you a yes/no question. I don't save a lot of time by typing "y" and "n" instead, but I save a lot of aggravation.
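
To give a sense of what a small .emacs file looks like in practice, here is a minimal sketch. Only the first setting comes from this article; the other two are illustrative additions that you may or may not want.

```elisp
;; From the article: answer prompts with "y" or "n" instead of "yes" or "no".
(fset 'yes-or-no-p 'y-or-n-p)

;; Illustrative additions, not from the article:
(column-number-mode 1)   ; show the cursor's column number in the mode line
(show-paren-mode 1)      ; highlight the parenthesis matching the one at point
```

Semicolons start comments, so you can annotate each line with a reminder of why you added it.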

More fun is defining new functions altogether. Here, a good place to start is the Emacs Wiki, with a useful feature like word count. You will get a feel for Lisp programming, and add a helpful new feature at the same time.
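
As a taste of what such a function can look like, here is a minimal sketch of a word counter for the selected region. It is not the Emacs Wiki's code; the function name my-count-words-region and the Ctrl-c w keybinding are choices made up for this example.

```elisp
;; Hypothetical function name, chosen for this example only.
(defun my-count-words-region (start end)
  "Count the words between START and END and show the total in the minibuffer."
  (interactive "r")                 ; "r" passes the selected region's bounds
  (save-excursion                   ; put the cursor back when finished
    (goto-char start)
    (let ((count 0))
      ;; Step forward one run of word characters at a time, stopping at END.
      (while (re-search-forward "\\w+" end t)
        (setq count (1+ count)))
      (message "Region contains %d words" count))))

;; Optional: bind the new command to Ctrl-c w (an arbitrary choice).
(global-set-key (kbd "C-c w") 'my-count-words-region)
```

After saving these lines in ~/.emacs and restarting Emacs, select some text and press Ctrl-c w (or run the command by name with Meta-x) to see the count.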

The Emacs Wiki is a good place to start looking whenever the "I wonder if there's a way to … " questions start popping up. It is filled to the brim with tips and code snippets you can use immediately, and links to far more that are maintained elsewhere. Plus, it is actively maintained, so there is no real danger of finding content out-of-date enough to do any harm. Browse around, and you will get an idea of what other people are doing with Emacs, as well as why they like it. You might even come up with some ideas of your own.

Linux How-Tos and Linux Tutorials : DocVert Can Handle All Your Document Conversion Needs

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Thought you'd be living in a Microsoft Office-free world by 2011? Unless you're in a Linux-only shop that does business only with other Linux-only shops, the chances are that dream remains a few years away, and you still have to drag out an office file converter periodically. The trouble is, each free software office suite has its own, and they vary in their capabilities. Enter DocVert, a worthy GPLv3-licensed utility to keep at the ready, thanks to its choice of CLI- and Web-based interface options, and its flexible output formatting.

Linux How-Tos and Linux Tutorials : Rapid Releases: How Are They Working for Firefox and Thunderbird 6?

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

The Firefox 6 browser and Thunderbird 6 email app are both out now, and if it seems like just yesterday that you were reading about Firefox 5 — no, it's not your imagination. Both releases are part of Mozilla's new rapid release strategy, which means there are fewer new features in each version, but hopefully less breakage as well. In this case, Web developers get some new tools on the browser front, but Mozilla still has major problems to iron out of the new release approach.

Linux How-Tos and Linux Tutorials : Weekend Project: Turn Thunderbird into a Groupware Client with SOGo Connector

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Mozilla Thunderbird has earned its place as an email app on Linux desktops, and its cross-platform nature makes it a popular choice for Windows and OS X machines as well — particularly when you include the calendaring and task management power of the Lightning add-on. But uptake in corporate environments has never been its strong suit, largely because it does not include groupware features such as event invitations and shared address books. The SOGo Connector extension adds those features and more, turning Thunderbird into a proper groupware client suitable for business use.

Linux How-Tos and Linux Tutorials : Review: RawTherapee 3.0 on Linux

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

The open source raw photo editor RawTherapee released version 3.0 at the end of July, with a revamped interface and a new palette of photo tools. RawTherapee is noteworthy for several reasons, including the fact that builds are available for Mac OS X and Windows, in addition to Linux. But this release also marks the first major update to the program since it was made free software. Let's take a look.

Linux How-Tos and Linux Tutorials : The Five Best WordPress Discussion Plugins

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Running an open source CMS like WordPress on your own LAMP server is supposed to give you the freedom to develop a better system than you would get from pre-packaged blogging services. Yet a lot of WordPress users effectively outsource their site's discussions by installing closed commenting systems like Disqus or IntenseDebate. In addition to handing someone else the keys to your community interaction, these third-party services can also be slow to load and respond to updates. Let's look at the open source alternatives.

Linux How-Tos and Linux Tutorials : A Look at the Filesystem Hierarchy Standard 3.0

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

It was big news when the 3.0 kernel was released at the end of July, but as luck would have it, another fundamental piece of your average distribution is about to bump its own version number up to 3.0 as well: the filesystem hierarchy standard (FHS). If you're not sure exactly what that means or why you should care, don't worry. It's the distros that implement the FHS — when it goes well, all you know is that your system runs smoothly. But that doesn't mean there's nothing important hidden away in this new release.

Linux How-Tos and Linux Tutorials : A Look at Mozilla’s BrowserID Project

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Mozilla launched its BrowserID project in June as an alternative to the now commonplace practice of relying on third-party commercial Web services as "identity brokers." If you want to offer users a non-intrusive way to establish an identity, without the overhead and security worries of an awkward off-site authorization process, it is worth taking a look.

Linux How-Tos and Linux Tutorials : Weekend Project: Use HoneyD on Linux to Fool Attackers

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

For the security conscious, there is always room for another weapon against attackers. Firewalls, intrusion detection systems, packet sniffers — all are important pieces of the puzzle. So too is Honeyd, the "honeypot daemon." Honeyd simulates the existence of an array of server and client machines on your network, including typical traffic between them. The phantom machines can be configured to mimic the signature and behavior of real operating systems, which will trick intruders into poking at them — and revealing themselves to your security staff.

Linux How-Tos and Linux Tutorials : RecallMonkey Brings Simple Search to Firefox History

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Where was that page you were just looking at a few days ago? If you're a Firefox user, answering that question just got a lot simpler. Yes, the browser offers simple bookmarking and niceties like bookmark folders and multi-browser sync — and yes, there are hundreds of add-ons to help you categorize, tag, export, import, and file your links in every conceivable way. But all too often, re-finding that page you just took a casual look at is still easier to do with a Web search engine — and that's the premise behind RecallMonkey, which gives you a search-engine-like interface to your own browsing history.

Linux How-Tos and Linux Tutorials : Screencasting Stars of the Linux World

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

Are you still taking screenshots? That is sooo last decade. Today if you want to showcase your application, your gaming skills, or even your astonishing new desktop wallpaper collection, you need a screen recorder (or screencasting tool) to capture full-motion video and audio of your desktop. You’ll find several solid options, but which one works best for you depends a lot on the type of content you need to capture, and what you intend to do with it.

Linux How-Tos and Linux Tutorials : PiTiVi Video Editor Now Kitten-Friendly

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

PiTiVi is a GStreamer-based non-linear video editor (NLE) developed by members of the GStreamer project itself. That means it is often the first project to showcase new features, and last month's new release is no exception. The major new feature is support for audio and video filter "effects" but there are usability and speed improvements worth examining, too.

Linux How-Tos and Linux Tutorials : FreeNAS 8.0 Simplifies Storage

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

The FreeNAS distribution is tailor-made for installation in a small office environment. It is an extremely low-resource network storage system that you can administer through a Web browser, but it supports high-end features like automatic backups, replication, LDAP or Active Directory authentication, and seamless file-sharing over NFS, CIFS, AFP, and even FTP and TFTP. The latest release — version 8.0 — is just a few weeks old, and it is the perfect time to take a look.

Linux How-Tos and Linux Tutorials : Weekend Project: Setting up DNS Service Discovery

This post was syndicated from: Linux How-Tos and Linux Tutorials and was written by: Nathan Willis. Original post: at Linux How-Tos and Linux Tutorials

DNS Service Discovery (DNS-SD) is a component of Zeroconf networking, which allows servers and clients on an IP network to exchange their location and access details around the LAN without requiring any central configuration. Most Linux distributions supply the Avahi library for Zeroconf support, but not nearly as many users take advantage of it. Let's look at an easy-to-set-up use for DNS-SD: providing automatic bookmarks to services. All it takes is an Apache module and a Firefox extension.