2009 in review

Posted: Wednesday December 30, 2009 at 8:13 p.m.

So much for updating this blog more regularly. I guess this blog simply doesn't rank high on my list of priorities. Work has been, for lack of a better word, intense! Fortunately it has been so in a good way. There has been no shortage of challenges, most of which were technical and generally interesting. And the really good news is the work continues.

From a personal projects perspective, I've had to pick and choose which to run with. Failure to do so would have left me with a hollow list of ideas that never amounted to anything beyond entry-level academic exercises. I played around with virtualization early in the year using OpenVZ. I liked the results, but found it was not a good bet for my future use, as my work has started to turn to VMware and Solaris Zones in a big way. Shortly after my post in February, I was able to complete an upgrade of the Django-based websites. I am still quite pleased with Django, though I've not been keeping up to date on that scene very well.

I've spent most of my free time focused on Scala Bazaars (sbaz). I've been improving the sbaz code base in earnest only since June or so. I am a bit ashamed it took me so long. This is in part due to my needing to better familiarize myself with the existing code. The bigger reason was a lack of vision, and therefore an unsure direction. I spent more time contemplating what to do than doing it. Sbaz has lost ground as a tool used by the public for a variety of reasons. First and foremost has been the waning development of both the code base and the public repositories. I am attacking the former head on, and the latter soon after. A more basic question is what use cases we should target. Most prospective users today are developers who have other tools that are more deeply integrated with IDEs, build tools, version management, etc. Should sbaz adapt to compete in the development space? Sbaz was modeled after the Linux package management tools, and I'm not convinced such a design can satisfy both the flexibility and consistency needs of developers unless we can establish some large scale release process similar to Linux distributions. My next post, if it happens within the next month or so, will likely be ramblings around some of these thoughts, and how I would like to see a sub-community start to form around sbaz.

Ultimately, I decided on a path of incremental improvements to existing functionality. This way I could contribute as little or as much as my existing family responsibilities and increasing work commitments would allow. My primary goals thus far have included:

  1. Update to Scala 2.8 and eliminate all compile warnings, including Java deprecations
  2. Introduce no backward incompatible changes (beyond the minimum of Java 5 required by the 2.8 upgrade)
  3. Improve dependency audits to prevent a broken managed directory from ever appearing
  4. If an audit fails, make it clear to the user why
  5. Keep the user informed (specifically when downloading)
  6. Expand the body of automated testing
  7. Improve Windows support
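
To illustrate what goals 3 and 4 mean in practice, here is a minimal sketch of a dependency audit. It is written in Python for brevity and is not sbaz's actual Scala code; the function name `audit`, the dictionary layout, and the package names `base-lib` and `app` are all hypothetical, made up for this example. The idea is to evaluate a proposed change before touching the managed directory and to say exactly which dependency would break.

```python
def audit(installed, to_remove=(), to_add=None):
    """Check that every dependency is satisfied after a proposed change.

    `installed` and `to_add` map a package name to the set of package
    names it depends on. Returns a list of human-readable problems; an
    empty list means the change is safe to apply.
    """
    # Build the package set as it would look after the change.
    proposed = {name: deps for name, deps in installed.items()
                if name not in to_remove}
    proposed.update(to_add or {})

    problems = []
    for name in sorted(proposed):
        for dep in sorted(proposed[name]):
            if dep not in proposed:
                problems.append(f"{name} requires {dep}, which would be missing")
    return problems


if __name__ == "__main__":
    # Hypothetical package names, for illustration only.
    installed = {"base-lib": set(), "app": {"base-lib"}}
    for problem in audit(installed, to_remove={"base-lib"}):
        print("audit failed:", problem)
```

Because the audit runs against the proposed state and returns its reasons, a failed audit can be refused outright with a message the user can act on.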

These are still works in progress, but good progress has been made. I have also added a few features like asynchronous downloads and support for pack200; I'm amazed I haven't seen the latter used more. I suppose the Scala requirement of Java 5 or later makes such support easier? At the time of this writing, there is still more to be done; however, the code base isn't far from being ready for the next major distribution release of Scala. I'd be working on it now if I weren't away from my primary development machine.

scala   (0)

So many projects...

Posted: Sunday February 22, 2009 at 1:46 a.m.

... and so little time. I really should make a point of updating this blog more frequently, as I've been doing a lot of new and interesting (at least to me) technical stuff in the *nix space. Up until the end of last year, my day job has primarily focused on application capability. What I mean is I spent most of my time working on application specific configuration and code changes to add features or fix bugs to better support the business's needs. Most of these applications are large, complex enterprise tools that fall into the "one size fits all because you can customize through configuration" category. Granted, the line between code and configuration is unclear at times. On occasion, I would get down to the application framework or operating system level. It was always a treat.

Late last year, a key person for the company website's site operations team moved on, and I was pulled in to fill a fraction of the void he left behind. I'm only working there at half capacity because I still need to support my previous areas, but it has been a refreshing deviation from the work I've been doing for the past 7 or so years. And when I get interested and excited about something, I begin to explore.

Virtualization

One area I've spent some time exploring is system virtualization software, and this is for several reasons. I only have one machine at home to do all my exploratory work. This poses a problem when playing around with load balancing configurations or testing network software. Virtual servers allow me to simulate many machines on a single box, so while it isn't perfect for emulating a production environment, it makes it possible with a non-existent budget. Trying out different flavors of applications can also have a negative (sometimes damaging) effect on a box. Installing, configuring, testing and then uninstalling can lead to garbage being left behind. If the install is invasive to the basic services of a box and you don't fully understand all aspects of these changes, it can interfere with the health of the system in the long term. Using a virtual environment for this kind of playing around means you can simply discard the environment if you decide not to go with that solution. One added benefit of installing critical services into a virtual environment is you can easily back up and transport that environment to another machine if the host box goes belly-up. Maybe I should get another machine...

Website - the whole package

I've had a few websites outside of work that I've developed (with someone else's code as a starting point) and maintain, this site being one of them. I've never put a proper backup/restore process into place for full, rapid restores. In the event of a failure, I would be able to restore most of it, but it would take a long time and some stuff may fall through the cracks.

Django 1.0 came out a while ago. I've successfully upgraded this site, but have some more upgrade work to do yet. This will likely take about 3 days to complete, if I could work on it full time. Unless I can use some dedicated time over a long weekend, it will probably grow into 3 weeks.

This site has only held my blog thus far. I think I want to expand it to serve as my project site as well (of which I currently have none), as I've got bandwidth and storage space to spare. Source code will be hosted via Mercurial, possibly through hgweb. I haven't decided if I want to use Trac. This partly depends on its RAM consumption because I need to keep several apps running on this hosting service, and memory is my limiting factor. I've already done some promising prototyping of a move from mod_python to mod_wsgi, so I appear to be making myself some wiggle room.

My Scala + db4o + Swing personal project

I've blogged about this project previously. I've succeeded in putting together a lot of the domain specific code thus far, but I'm now toiling with the application framework (API for managing the main window, opening/closing views, starting and stopping services). I haven't decided on the user experience yet: single tabbed window, multi window, etc. The options are plentiful, but my goal is to keep it lightweight with the potential for using it on a smartphone. I've started going down the path of OSGi (another learning experience). That may or may not continue. This is one project I would like to host out of this web site.

sbaz - Scala's package manager

I am presently the named maintainer for sbaz. Sbaz will start to become a bigger part of this blog. Presently, I am still struggling with defining a path forward for sbaz. I'm considering hosting some sbaz universes out of this site (e.g. my personal project above, a managed developer release universe, etc.). My biggest obstacle with this is my limited memory resources, so deploying the existing Java servlet won't be possible. Perhaps using PHP, Python or flat files to host a universe should be my first enhancement.

General system administration notes

I need a reliable location to document many of my system administration learnings, and this website seems to be a logical location. To date, I've peppered README files throughout my machine's filesystem, but the point behind the README files is to repeat the steps I need to perform on a new (e.g. replacement) machine. I've also failed to document some good stuff because I hope to, at some point, have a central repository. So, I delay documenting until I forget to do it altogether, a pattern I've recently recognized and am changing... leading to the aforementioned README files. Some of the things I intend to document include:

db4o | misc. | scala   (0)

A new personal project

Posted: Friday June 6, 2008 at 10:59 p.m.

After coming to the disappointing conclusion that I cannot, at this time, pursue building my NAS device from scratch, I've turned toward another long contemplated and frequently revisited project. For quite some time I've dabbled with various Java based functionality, such as db4o, Scala and RCP frameworks, and have been toying with some ideas to implement a useful application from scratch.

I know it is pretty sad that a professional software developer of 7 years is psyched about creating an application from scratch, but most of my development work involves configuring and/or extending existing applications purchased from third party vendors, identifying and fixing bugs in code others have written, addressing performance issues, and interfacing multiple enterprise applications. I have done very little UI development, and most of that has been for web applications.

Standing at the start of my desktop application exploration, I realize my first goal is to implement something simple and useful with an expandable design. The tool needs to be useful to keep my interest, otherwise the application will be yet another academic exercise. The simple but useful feature provides the bait that leads me along, but by having an extensible design, adding additional features becomes incremental growth instead of titanic rewrites and redesigns. So, I have the following thoughts:

  1. The application should run on the JRE and use the Scala programming language for at least part of the functionality. I am learning to love functional programming, but haven't applied it to any practical application yet.
  2. If using a database, I would like to use db4o instead of a relational database. A lot of the stuff that slows down development with a database goes away when the Java class IS the database schema. No dual implementation (Java class and relational schema) or mappings are required. Of course, I may be better off persisting my data as flat files.
  3. I should standardize on a single IDE (Eclipse or NetBeans), as this is primarily a single developer effort, and trying to develop in two different IDEs would require a lot of overhead. Unfortunately, this adds limitations to the technology I can use for development. For example, Scala only has a plug-in supporting Eclipse, I am more familiar with Eclipse's features, I like the idea of learning more about OSGi, and I like the responsiveness of SWT vs. Swing. On the flip side, only NetBeans has Matisse (for free anyway), the profiler seems nicer, RCP appears to be easier, I'm more familiar with the Swing API, I like the idea of a single distribution for all machines (SWT requires different DLLs), and I LOVE the Mercurial plug-in.
  4. I don't expect this code will be opened for general availability, but I should take legal considerations (GPL, LGPL, EPL, BSD, etc.) into account when choosing libraries. Otherwise, the conflicting licenses could become prohibitively difficult to work around.
  5. I would like to leverage a multi-document interface (MDI), at least optionally, with tabs that can be reordered, maximized to full screen, closed with middle button click, and possibly minimized to border buttons.
  6. Ideally, data could be replicated between machines with ease, as I would like to use the application on my home and work machines. This, of course, requires multi-platform functionality, as my work laptop runs Windows XP while my home machine runs Ubuntu Linux. (Gosh how I wish I could run Ubuntu at work too...)

Clearly, I have more to think about before choosing an IDE. As for the application's features, I've considered the following:

  1. A secure password store
  2. A wiki style notebook similar to Tomboy Notes for Linux. I found that taking notes in ASCII with a markup "language" like Markdown worked quite well. The file's name contains the timestamp of when the file was created, and all context of what the note was about is contained within. When the file is stored in the file system, the operating system's indexing tool can be leveraged for searches, and the application can manage categorization in a flexible way. I'd like to include IDE style tab completion for internal links, syntax highlighting, a daily to-do list feature, hyperlinks outside the application (e.g. web pages, Lotus Notes documents, network share locations, etc.) and more.
  3. A batch image resize tool for high quality images. I've actually already implemented this using the ImageJ library available in the public domain, but the GUI is a bit hackish and not robust.
  4. A contact management application that leverages low level relationships. For example, a family of five may all have the same contact information while the kids are young, but as each kid gets his/her own email address, mailing address (e.g. goes to college), cell phone, etc., each individual's contact information can be updated accordingly. Or, if the entire family moves to a new location, a single change to their mailing address node would update all family members at once. This would be a perfect application for db4o.
  5. An invoicing and general business tracking application for my wife to use for her photography business. This is a nontrivial piece of functionality with the potential to grow significantly. It could be used to drive marketing, schedule photo shoots, track income and expenses and so on.
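
The contact management idea in item 4 depends on people sharing a reference to one address node instead of each holding a copy. Here is a minimal sketch of that idea using plain Python objects rather than db4o; the class names, field names, and sample data are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """A mailing address node that several people may point at."""
    street: str
    city: str


@dataclass
class Person:
    name: str
    address: Address  # a shared reference, not a copy


if __name__ == "__main__":
    home = Address("12 Elm St", "Springfield")
    family = [Person(name, home) for name in ("Ann", "Bob", "Cal")]

    # The whole family moves: one change to the shared node updates everyone.
    home.street = "98 Oak Ave"

    # One child goes to college and gets an address node of their own.
    family[2].address = Address("Dorm 4", "Collegetown")

    for person in family:
        print(person.name, "->", person.address.street, person.address.city)
```

An object database stores this kind of object graph directly, which is why the shared node needs no join table or mapping layer.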

I really need to have an IDE that makes the GUI layout and interaction with the business logic clear and easy to do in order to implement all that I would like. I suspect this seems daunting mostly because it is new to me. I'm still looking for some best practices on how to lay out the MVC design of the GUI, though I suspect I'll come to a solid conclusion only after I've tried it.

db4o | Java | NAS | RCP | scala   (0)

NAS put on hold

Posted: Saturday May 31, 2008 at 12:29 p.m.

As it turns out, life really is what happens when you are making other plans. Since the last time I posted, I've had a few things come along requiring big money... bummer. I guess I'll have to wait a little longer before I build this machine. I suppose this isn't all bad, as chances are the cost of these components will go down over time, assuming the economy doesn't throw a wrench into that theory.

I did get close to a finalized configuration of components with a slight change to the machine's application. Originally, I wanted to make the Network Attached Storage device double as the firewall and gateway for my home network to the internet. For security purposes, combining the firewall and storage isn't recommended, as a security failure in the firewall makes for easy access to all your data. Besides that, I already have a wireless Netgear router that can also serve as a gateway for a local wired network.

I currently have a 31" LCD television, a HDHomeRun (network enabled digital TV receiver), and no way to connect the two. A simple and lightweight media PC would nicely bridge this gap, and many of my NAS requirements would also apply here. With a media PC, I would finally have the convenience of TiVo style TV watching, and I could create backup copies of the kids' videos that are at high risk of death by scratching. Of course, the sheer volume of data is a concern when dealing with multimedia, particularly video. To get the most out of my hard drive space and backup media (CDs and DVDs), the machine would need to support efficient MPEG4 decoding at the very minimum. I have a Core 2 Duo workstation that could offload the conversion processing from MPEG2 to MPEG4 (or better), so I'm not too concerned with the media PC's processor power as long as it supports hardware decoding of MPEG4. Given these thoughts, I've put together the following list of components:

I already have the hard drives and PCI to SATA 1.5 expansion card. These will need to be moved from their existing machines into the new one once it is built.

I haven't yet decided how I want to boot the machine. Ideally, the hard drives would exist solely for storage. No applications would be installed on them at all. There is one more IDE device supported (in addition to the optical drive), so I could install another hard drive. However, I would like to avoid using another hard drive to save space and power. I was thinking of something like a Compact Flash card installed as a non-removable IDE device (requires adapter card) or a USB flash drive wired directly to the motherboard and rigged inside the case. I suspect the Compact Flash card would be more performant, but it would also cost a little more.

The parts listed here aren't super expensive. Unfortunately, these aren't the only costs involved. To build the machine I really want, I need to build my own case from scratch. I not only need to buy the materials (not sure what I want to use yet), but I also have to purchase some tools, like a Dremel. I have some rough ideas on how I would like to set up the inside of the case. Maybe I'll put together some pictures in the future, but until then, here is a brief description:

With this layout, I'd like to get an overall size of 20cm wide x 20cm tall x 25cm deep, or roughly 8in wide x 8in tall x 10in deep.

I hope to sometime return to this project, as I suspect this would be a good little machine that is used a lot. Until then, maybe someone else out there could use some of my thoughts to build something similar. If you do or already have, please share your experience. I would love to hear about your successes and growing pains.

NAS   (0)

NAS - hard drive and transfer speeds

Posted: Saturday April 5, 2008 at 9:32 p.m.

Okay, so I've got the general idea of what I want, now it is time to hash out some of the specs. Where to start?

The working parts of the device should be modern or at least based on modern standards. Since I will be going to all the trouble of building this device from scratch, I don't want to be put into a position where a significant redesign is needed in the event of a hardware failure of some kind.

Generally speaking, a NAS device doesn't require much processing power. The largest bottleneck in performance will most likely be the network itself. Most home networks will use wired fast Ethernet (100 Megabits/sec), wired gigabit Ethernet (1000 Megabits/sec), wireless g (54 Megabits/sec), or wireless n (248 Megabits/second). Note, however, that these measurements use the bit as the base unit of measure, not the byte (8 bits) that most of us are comfortable with.

Fortunately, Wikipedia has an excellent collection of device bandwidths available at http://en.wikipedia.org/wiki/List_of_device_bandwidths that shows data transfer rates in both bits and bytes. It also appears to be a fairly complete list of device interfaces that may be used in this system.

One important point to keep in mind is the perceived performance of a NAS device will be affected by the slowest point in the data transfer chain. This includes the network adapter on the machine using the storage device and everything between the two machines. I've configured my home network to support gigabit transfer rates, so my NAS device should take full advantage of this. This means I should be able to get a theoretical top transfer rate of 1000 Mb (megabits) per second, which is equal to 125 MB (megabytes) per second.
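
The bit-to-byte arithmetic is easy to check with a few lines of Python. This is only a minimal sketch: the function name and the dictionary are my own, and the rates are the nominal figures quoted above, not measured throughput.

```python
# Nominal link rates in megabits per second, as quoted above.
LINK_RATES_MBIT = {
    "fast Ethernet": 100,
    "gigabit Ethernet": 1000,
    "wireless g": 54,
    "wireless n": 248,
}


def mbit_to_mbyte(mbit_per_s):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8


if __name__ == "__main__":
    for name, rate in LINK_RATES_MBIT.items():
        print(f"{name}: {rate} Mbit/s = {mbit_to_mbyte(rate):.2f} MB/s")
```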

Here is a breakdown of possible hard drive transfer rates for this device. SCSI is a bit out of my price range and overkill for a home solution.

  1. Ultra DMA ATA 66 - 528 Mbit/s = 66 MB/s
  2. Ultra DMA ATA 100 - 800 Mbit/s = 100 MB/s
  3. Ultra DMA ATA 133 - 1064 Mbit/s = 133 MB/s
  4. SATA 150 hard drive - 1500 Mbit/s on the wire, 1200 Mbit/s after 8b/10b encoding = 150 MB/s
  5. SATA 300 hard drive - 3000 Mbit/s on the wire, 2400 Mbit/s after 8b/10b encoding = 300 MB/s

The last three interfaces are all faster than the maximum transfer rate possible over a gigabit network and should be suitable for such an application. As stated before, other technical merits (hot pluggable, no restrictions on writing to multiple devices at once, newer standard with stronger future) make SATA the ideal option. In the current market, you would be hard pressed to find large SATA hard drives that don't conform to the 300 standard, and the price differences between 150 or 300 aren't significant. You are more likely to find disk controllers on the motherboard or expansion cards (e.g. PCI cards) that use the slower SATA 150 interface. Fortunately, virtually all SATA 300 hard drives are backward compatible with the SATA 150 controllers, so compatibility is virtually a non-issue.

Speaking of expansion cards, it is quite likely one will be needed in my device. This adds a new interface to take into account when assessing bandwidth bottlenecks. According to the Wikipedia page, the 32 bit PCI expansion slot running at 66 MHz (the most common kind of expansion slot today) has a theoretical maximum transfer rate of 2133 Mbit/s, or 266.7 MB/s. This means that one should not expect the best performance from a single SATA 300 or multiple SATA 150 drives connected to such an expansion card. Still, for the needs of a NAS device, a PCI expansion card supporting two SATA 150 drives should work nicely for expanding a RAID configuration. Possibly using a PATA style configuration (pairing a motherboard controller with a PCI controller for RAID1 mirroring arrays) would optimize reads and writes when dealing with large files.
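
To make the bottleneck reasoning concrete, here is a rough sketch that picks the slowest link in a transfer chain and estimates the best-case rate per drive when several drives share one PCI card. The function names are my own. The figures are theoretical maxima in MB/s, with SATA 150 taken as 150 MB/s of usable bandwidth, which is what its name refers to. Real-world throughput will be lower.

```python
def bottleneck(chain):
    """Return the (name, rate) pair of the slowest link in a transfer chain.

    `chain` maps a link name to its theoretical maximum rate in MB/s.
    """
    name = min(chain, key=chain.get)
    return name, chain[name]


def per_drive_share(bus_mb_s, drive_mb_s, drives):
    """Best-case rate per drive when `drives` drives share one bus equally."""
    return min(drive_mb_s, bus_mb_s / drives)


if __name__ == "__main__":
    # Theoretical maxima in MB/s for one path from client to disk.
    chain = {
        "gigabit Ethernet": 125.0,
        "32-bit/66 MHz PCI slot": 266.7,
        "SATA 150 drive": 150.0,
    }
    name, rate = bottleneck(chain)
    print(f"Slowest link: {name} at {rate} MB/s")

    # Two SATA 150 drives busy at the same time on the same PCI card.
    print(f"Per drive: {per_drive_share(266.7, 150.0, 2):.2f} MB/s")
```

With these numbers the network is the limiting link, and two drives sharing the card still each get more than the 125 MB/s the network can carry.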

NAS   (5)
