08 December 2008

Splunk - Google for log files

IntelliJ IDEA 8's UML package diagram view

Explore your code base with IntelliJ IDEA 8's UML package diagram view. The class view can be a little too detailed, but the package view is a useful alternative to the Dependency Structure Matrix. A DZone article discusses it further.

Reflections - Java runtime metadata, scan classes

A Java runtime metadata analysis, in the spirit of Scannotations

Reflections scans your classpath, indexes the metadata, allows you to query it at runtime and may save and collect that information for many modules within your project.

Using Reflections you can query your metadata for:

* get all subtypes of some type
* get all types annotated with some annotation
* get all types annotated with some annotation, including annotation parameters matching
* get all methods annotated with some annotation

29 November 2008

CodeExplorer Intellij IDEA plugin

CodeExplorer looks like an interesting IntelliJ IDEA plugin. Update: tried it out, and it is useful for visualising method calls within a single class, or a few classes.

Alex Miller on use of the final keyword

Another Java Excel library

Came across another Java library for Excel, from TeamDev who seem to produce an interesting range of products

TeamDev — Java components & Java libraries. Java web components, Java AWT component, Java graph library & Java lib & API
# JNIWrapper enables access to native libraries and components from Java code without using JNI. The technology supports Windows platforms, Mac OS X (Mac PPC/Intel) and multiple Linux platforms.
# ComfyJ provides bi-directional communication between the Java platform and COM technologies (Java – COM Bridge), in pure Java language.
# JExplorer enables you to embed Microsoft Internet Explorer into Java applications as either a visual or non-visual component. It gives you full access to the native browser API, while requiring no specific knowledge of COM technology. It’s easy to integrate Java & Internet Explorer using JExplorer.
# JExcel enables effective integration of Microsoft Excel workbooks and spreadsheets into Swing-based applications. This is a powerful Java Excel bridge.
# WinPack is a free add-on to JNIWrapper enabling access to the Windows native API and libraries from Java code, entirely in the Java language.
# JxCapture is a cross-platform library providing a comprehensive Java screen capture API.
# JxBrowser enables seamless integration of Mozilla Firefox browser into Java AWT/Swing applications on Windows, Linux, Mac OS X (Intel and PPC-based) platforms. JxBrowser makes creating a powerful Java browser easy and simple.

Some development points by Miško Hevery

Guide: Writing Testable Code | Miško Hevery
Flaw #1: Constructor does Real Work

Warning Signs

* new keyword in a constructor or at field declaration
* Static method calls in a constructor or at field declaration
* Anything more than field assignment in constructors
* Object not fully initialized after the constructor finishes (watch out for initialize methods)
* Control flow (conditional or looping logic) in a constructor
* Code does complex object graph construction inside a constructor rather than using a factory or builder
* Adding or using an initialization block
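As a rough illustration of the first warning sign, here is a small sketch. All the class names are invented for this example and do not come from the guide. It contrasts a constructor that builds its own collaborator with one that only assigns what it is given.

```java
// Hypothetical collaborator, invented for this example.
interface RateSource {
    double rateFor(String currency);
}

// Hard to test: the constructor does real work by creating its own dependency.
class HardToTestConverter {
    private final RateSource rates;

    HardToTestConverter() {
        // "new" in a constructor: a test cannot substitute a fake RateSource
        this.rates = new RateSource() {
            public double rateFor(String currency) {
                throw new IllegalStateException("would call a remote service");
            }
        };
    }

    double convert(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

// Easier to test: the constructor does nothing more than field assignment.
class Converter {
    private final RateSource rates;

    Converter(RateSource rates) {
        this.rates = rates;
    }

    double convert(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}
```

A test can then hand Converter a stub RateSource that returns a fixed rate, which is not possible with the first version.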

Flaw #2: Digging into Collaborators

Warning Signs

* Objects are passed in but never used directly (only used to get access to other objects)
* Law of Demeter violation: method call chain walks an object graph with more than one dot (.)
* Suspicious names: context, environment, principal, container, or manager
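A minimal sketch of what "digging into collaborators" can look like, and one way round it. The Wallet, Customer and Till classes are hypothetical, made up purely for illustration.

```java
// All names below are hypothetical, made up for this example.
class Wallet {
    private double balance;

    Wallet(double balance) { this.balance = balance; }

    double getBalance() { return balance; }

    void debit(double amount) { balance -= amount; }
}

class Customer {
    private final Wallet wallet;

    Customer(Wallet wallet) { this.wallet = wallet; }

    Wallet getWallet() { return wallet; }
}

// Digs into collaborators: takes a Customer only to reach the Wallet inside it.
class DiggingTill {
    void charge(Customer customer, double amount) {
        customer.getWallet().debit(amount); // call chain with more than one dot
    }
}

// Asks for what it actually needs, so a test only has to build a Wallet.
class Till {
    void charge(Wallet wallet, double amount) {
        wallet.debit(amount);
    }
}
```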

Flaw #3: Brittle Global State & Singletons

Warning Signs

* Adding or using singletons
* Adding or using static fields or static methods
* Adding or using static initialization blocks
* Adding or using registries
* Adding or using service locators

Flaw #4: Class Does Too Much

Warning Signs

* Summing up what the class does includes the word “and”
* Class would be challenging for new team members to read and quickly “get it”
* Class has fields that are only used in some methods
* Class has static methods that only operate on parameters

08 November 2008

03 November 2008

Hazelcast - sounds similar to Terracotta - distributed Java

* Distributed implementations of java.util.{Queue, Set, List, Map}
* Distributed locks via java.util.concurrent.locks.Lock interface
* Distributed implementation of java.util.concurrent.ExecutorService
* Support for cluster info and membership events
* Dynamic discovery
* Dynamic scaling to hundreds of servers
* Dynamic partitioning with backups
* Dynamic fail-over

A few things you can do with Apache commons-lang, Apache commons-io and Google collections

SuperScheduler - Java scheduling library

Main features of Super Scheduler

* It fully supports distributed computing environment. You can schedule and run tasks on any machine in the network at any time, either on application server machines or on client machines. Many instances of Super Scheduler can run at the same time while a task will run once and only once at the scheduled time. All instances of Super Scheduler share up-to-date information automatically and promptly (default is two seconds).
* It fails-over. If any instance of Super Scheduler is down or busy, others will pick up the tasks.
* It is scalable. If one instance of Super Scheduler is too busy, you can always add another machine to run another instance of Super Scheduler. All instances of Super Scheduler load-balance themselves in a natural way.
* It provides Doer and Talker modes, allowing tasks to be fully managed and monitored anywhere in the world.
* It supports many types of jobs: Email, FTP, Java, and Operating System. It also supports workflow jobs, composite jobs and retry jobs.
* It detects missed tasks and sends alarm email messages.
* It has a built-in GUI tool for scheduling and monitoring tasks. Monitoring windows refresh themselves automatically and promptly. It allows you to suspend tasks, reactivate tasks and run tasks immediately.
* It allows you to visually set Workflow job.
* It logs major task related events. It provides history facilities.
* It supports a running period for all tasks.
* A task runs on a desired machine if specified.
* All tasks can be programmatically controlled by simple Java API or any other program (like C/C++).
* It provides very flexible schedule terms: minutely, hourly, daily, weekly, monthly, on specific week days and at specified times.
* It provides very comprehensive holiday/weekend policies and facilities providing choices of Next business day, Previous business day and Skip. Any change to holidays will take effect automatically and promptly.
* It provides a lightweight daemon for non graphic environment.
* It provides a Role Based Entitlement management.

26 October 2008

Introduction to nonblocking algorithms

Milton - Java webdav api

Milton is an open-source server-side WebDAV API written in Java. It could be useful; I have certainly found WebDAV useful for exposing file-based services through firewalls.

20 October 2008

Access java code in Excel

Obba: A Java Object Handler for Excel | Javalobby
With Obba, you can easily build Excel GUIs to Java code.
Its main features are:

* Loading of arbitrary jar or class files at runtime through an Excel worksheet function.
* Instantiation of Java objects, storing the object reference under a given object label.
* Invocation of methods on objects referenced by their object handle, storing the handle to the result under a given object label.
* Asynchronous method invocation and tools for synchronization, turning your spreadsheet into a multi-threaded calculation tool.
* Serialization and de-serialization (save Serializable objects to a file, restore them any time later).
* All this through spreadsheet functions, without any additional line of code (no VBA needed, no additional Java code needed).

Is Code Coverage Important?

Setup svnsync

02 October 2008

SVNKit - Java api to Subversion

Controlling Subversion from Java via an API like SVNKit could be quite useful. I never really thought about using it from a typical application, but I have written applications that have versioned files, so why reinvent the wheel? Just use Subversion via SVNKit. An example might be managing changes to configuration files at runtime.

SVNKit is a pure Java toolkit - it implements all Subversion features and provides APIs to work with Subversion working copies, access and manipulate Subversion repositories - everything within your Java application.

SVNKit is written in Java and does not require any additional binaries or native applications. It is portable and there is no need for OS specific code. SVNKit is compatible with the latest version of Subversion.

01 October 2008

Bash parameter expansion

Bash Parameter Expansion

Ensure a database doesn't undo all your hard work in increasing concurrency

Caching, Parallelism and Scalability | Javalobby

So you make your application super scalable... don't just stick a database in the way...

databases that touch disks that inherently involve sequential, serial processing. And there is no way you can change your database vendor’s code. Systems that involve all but the simplest, most infrequent database use would be facing a massive bottleneck thanks to this serialization. Databases are just one common example of process serialization, but there could be others as well. Serialization is the real enemy here, as it undoes any of the throughput gains parallelism has to offer. Serialization would only allow a single core in a multi-core system to work at a time, and limit usable computing power. And this is made even worse in a cluster or grid. Using a bigger and beefier database server reduces this somewhat, but still does not overcome the problem that as you add more servers to your grid, you still have process serialization going on in your database server.

Distributed caches are a solution to this problem.  They are simple to implement where data is immutable (read-only), things become a bit more complex when the data being distributed is mutable.  Ideally the cache dynamically distributes the data across nodes based on runtime usage.

Replacing System.currentTimeMillis() with a Clock interface

I often avoid calling System.currentTimeMillis() directly in code; instead I inject an interface into objects that need to know the current time. This interface, let's call it Clock, basically provides a single method, getCurrentTime(). Reading Alex Miller - Controlling time, I see I'm not alone in this.
This is awesome for unit testing as you can create your own Clock, fix the time, control the rate at which it advances, test rolling over daylight savings time changes, time zones, etc. But all of your code just relies on Clock, which is the standard facade.

... it's also very useful in event processing systems: you can factor out time-based behaviour, so that the same code can be used for processing historical and realtime events.
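A minimal sketch of the idea, assuming getCurrentTime() returns milliseconds since the epoch (the post does not specify the unit). ManualClock and Session are hypothetical names used only to show how a test might control time.

```java
// The interface described above: a single method returning the current time.
interface Clock {
    long getCurrentTime(); // assumed here to be milliseconds since the epoch
}

// Production implementation, delegating to the system clock.
class SystemClock implements Clock {
    public long getCurrentTime() {
        return System.currentTimeMillis();
    }
}

// Test implementation (hypothetical name): time only moves when told to.
class ManualClock implements Clock {
    private long now;

    ManualClock(long start) { this.now = start; }

    public long getCurrentTime() { return now; }

    void advance(long millis) { now += millis; }
}

// Example consumer (hypothetical): depends on Clock, never on System directly.
class Session {
    private final Clock clock;
    private final long expiresAt;

    Session(Clock clock, long timeoutMillis) {
        this.clock = clock;
        this.expiresAt = clock.getCurrentTime() + timeoutMillis;
    }

    boolean isExpired() {
        return clock.getCurrentTime() >= expiresAt;
    }
}
```

In a unit test you would create a ManualClock, build the Session with it, and call advance() to step past the timeout without sleeping. For replaying historical events, the same trick applies: advance the clock to each event's timestamp before processing it.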

Feedback and project management during the software development lifecycle

Alex Miller - Software Rhythm: Mid-Game

Some good comments on feedback during software development

What I have seen is the value of visible public feedback during development. One way to do that is with iterations that provide working features on a regular basis. Another way is to do periodic milestone demos or usability tests or performance testing or whatever makes sense for your project. The most important demo is the first one as you’re probably building something completely wrong and when the user or product manager sees it, they’re going to freak out. That’s ok. In fact, it’s great because you will get plenty of feedback to build the next version.

I think you’ll find that many of our best practices happen to correlate to cycle feedback. Releases get feedback through acceptance testing and actual users. Iterations (or milestones) get feedback by showing work earlier to users and re-planning. Daily builds give feedback by running a suite of tests and reporting on quality. Daily standup meetings give feedback on who’s doing what. Commit sets give feedback by exposing your changes to others and possibly by running commit-time hooks for test suites. Code/test/refactor cycles give you feedback by green bar unit test runs. Post-mortems (or the trendier term “retrospectives”) can give the team feedback about itself at whatever frequency you want.

I’ve come to believe that most process improvements can be tied to a feedback cycle

And on project management:

From a project management point of view, one of the most important things you need to figure out is how you will track the project while you’re in the middle of it.

The most commonly used tool when planning and tracking is the dreaded Gantt chart. I’ve loathed the Gantt chart for a long time but I’ve finally figured out that I hate it because my early exposure to it was using it as a tracking tool. And for tracking the progress of software development, there are few tools worse suited than the Gantt chart. The Gantt displays all tasks as independent and of fixed length. But tasks in the development world tend to be far more malleable, interconnected, and incremental than it’s possible to represent in a Gantt.

Faced with this mismatch, you have two choices: 1) ignore what’s actually happening in the project and approximate it in your rigid schedule or 2) modify the Gantt on a daily basis to reflect the actual state of the project down to the minute. The first will cause developers to ignore the schedule because the schedule seems completely disconnected from reality. The second is madness because the amount of daily churn means the project manager will do nothing else and he will be pestering developers constantly as he does it. These both suck.

For every feature, you should:

* have a list of tasks to be completed (this evolves constantly)
* know the order they need to be done in (but I find this to be intuitive and not necessary to spell out)
* know whether tasks are completed, in progress, or not started
* know who is doing them
* know estimates for each remaining task

There are a bunch of ways to do this kind of tracking during the life of a project. I’ve most commonly used a spreadsheet or plain text file, but have had some success with agile-oriented PM tools like Rally or VersionOne. The point is that for many, many projects you can track this stuff with little work by relaxing your death grip on the initial schedule.

You do need to know how much work remains so that you can either adjust end dates or de-scope by dropping or shaving features. You can determine this with: (tasks remaining * estimates) - people time available. If that’s > 0, you need to take corrective action. I find doing this adjustment on a weekly basis works pretty well and can take less than an hour if you stay on top of it. A burn-down chart is a really nice way to represent this info.

Can't see offshore software development working for any complex project

What does the offshoring backlash tell us? • The Register

Quite funny

After 2 years of excuses, laziness, constant turnover (complete waste of training time when the guy/girl buggers off and leaves you with a new muppet), terrible or copied-from-Google code, never-ending bugs, headaches, baffling phone calls where no-one understood each other, emails that promised to "do the needful" but went ignored, applications that just didn't work, MILLIONS of dollars, and much, much more....... we had enough, and told the Indian coding behemoth we'd had enough and brought our dev team back in house. Saying that things go more smoothly is a massive understatement. Don't know why we bothered. Oh yes, some spreadsheet said it would be cheaper.
This informal communication is completely lost when parts of a project are outsourced. Sending the same spec to another country to be evaluated by a developer who has never met the author and who must route all queries through an account manager just does not work.

Lifehacker's best-of awards for productivity apps

Best of the Best: The Hive Five Winners

Nice to see "pen and paper" winning the best GTD application. Still not happy with the choice of web-based contact management apps; I wish Google would pull their finger out and make their contacts app as good as their email and calendar apps.

Some interesting new Java annotations in JSR 305

java.net: The Open Road: javax.annotation

Covering annotations such as:






Universal uploader for Flickr, Google Docs and other web services

Smooks looks like a useful Java library for processing data files

Milyn - Smooks
Smooks is a Java Framework/Engine for processing XML and non XML data (CSV, EDI, Java etc).

Smooks can be used to:

* Perform a wide range of Data Transforms - XML to XML, CSV to XML, EDI to XML, XML to EDI, XML to CSV, Java to XML, Java to EDI, Java to CSV, Java to Java, XML to Java, EDI to Java etc.
* Populate a Java Object Model from a data source (CSV, EDI, XML, Java etc). Populated object models can be used as a transformation result itself, or can be used by (e.g.) Templating resources for generating XML or other character based results. Also supports Virtual Object Models (Maps and Lists of typed data), which can be used by EL and Templating functionality.
* Process huge messages (GBs) - Split, Transform and Route message fragments to JMS, File, Database etc destinations.
* Enrich a message with data from a Database, or other Datasources.
* Perform Extract Transform Load (ETL) operations by leveraging Smooks' Transformation, Routing and Persistence functionality.

Smooks supports both DOM and SAX processing models, but adds a more "code friendly" layer on top of them. It allows you to plug in your own "ContentHandler" implementations (written in Java or Groovy), or reuse the many existing handlers.

08 September 2008

My router's bigger than yours

Cisco 7600 OSR Backbone Router
The Cisco 7600 OSR consists of a 256 Gbps switching fabric and a 30 million packets per second (mpps) forwarding engine.

JavaPerformanceTuning points on Java system time

Tips July 2008
# System.currentTimeMillis can have resolution times as bad as 60ms on some systems, and in any case resolution is very system dependent. Don't use it to try to reliably measure short times.
# System.nanoTime returns the number of nanoseconds since some arbitrary offset. It is useful for differential time measurements; its accuracy and precision should never be worse than System.currentTimeMillis; it can deliver accuracy and precision in the microsecond range on many modern systems.
# ThreadMXBean.getCurrentThreadCpuTime offers the possibility of measuring CPU time used by the current thread. But it may not be available, and may have significant overheads. It should be used carefully.
# On Windows, System.nanoTime involves an OS call that executes in microseconds, so it should not be called more than once every 100 microseconds or so to keep the measurement impact under 1 percent.
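The tips above suggest System.nanoTime for differential measurements. A small sketch of that usage follows; ElapsedTimer is a hypothetical helper name, and the loop being timed is just filler work.

```java
class ElapsedTimer {
    // Times a task with System.nanoTime, which is only meaningful as a difference
    // between two readings, never as an absolute wall-clock time.
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long nanos = timeNanos(new Runnable() {
            public void run() {
                long sum = 0;
                for (int i = 0; i < 1000000; i++) {
                    sum += i;
                }
                // use the result so the loop is not trivially removable
                if (sum < 0) throw new IllegalStateException();
            }
        });
        System.out.println("elapsed: " + (nanos / 1000) + " microseconds");
    }
}
```

Note that one reading like this is not a benchmark; it only shows the call pattern.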

Avoid NIO, Get Better Throughput?

Avoid NIO, Get Better Throughput | Javalobby
However, in most instances, maybe non-blocking I/O is not necessary at all? In fact, maybe it is detrimental to performance?... The characteristics of JVMs and threading libraries change as new advances are made. Good advice often becomes bad advice over time... Paul’s experiments show that higher throughput is achieved with blocking I/O, that thread-per-socket designs actually scale well, and that the costs of context-switching and synchronisation aren’t always significant.

Hmmmmm, I guess it's worth trying both? Hopefully using existing libraries for network applications allows you to do this simply.

03 September 2008

Crazy software story

The Graphing Calculator Story

I love the quote
the first 90 percent of the work is easy, the second 90 percent wears you down, and the last 90 percent - the attention to detail - makes a good product

17 August 2008

Quickly change the case of file suffixes

for f in *.JPG; do mv -- "$f" "${f%.JPG}.jpg"; done

Be careful to run this command in the directory containing the files.

09 August 2008

Collective Stupidity

Thought this was quite a funny article on the problems of decision-making in large groups.

Collective Stupidity
The natural tendency may be just to make decisions that are acceptable and least-common-denominator; decisions that no one disagrees with -- because it's easier and less contentious. But this doesn't seem to be the path to success.
The company that avoids collective stupidity is a statistical anomaly. Something about the way we do and think about decisions -- something so fundamental that we do it below the threshold of consciousness -- builds in this inevitable tendency towards the non-innovative, non-creative, non-fun organization.

Change management for databases

Automation for the people: Hands-free database migration
Databases are often out of sync with the applications they support, and getting the database and data into a known state is a significant challenge to manage. In this installment of Automation for the people, automation expert Paul Duvall demonstrates how the open source LiquiBase database-migration tool can reduce the pain of managing the constant of change with databases and applications.

Floating Point Math in Bash

Object State

Tim Boudreau's Blog: Where's the state?

Where's the state? This is a small but useful question when deciding how a problem domain gets carved up into objects: What things have state? What things have values that can change? When and how can they change? Can the changes be observed? Who needs to observe changes in state?
Any time you find yourself writing a setter (or more generally, mutator) method, it is worth asking yourself Who does this piece of state really belong to? Am I putting state in the right place? Blindly following the beans pattern (add a getter/setter to whatever type you happen to be editing) can result in state being distributed all over objects in an application. The result will be harder to maintain, and if it has any lifespan, less and less maintainable over time. Like entropy, or the no broken windows theory, the more state is distributed all over the place, the more that will masquerade as design and make it okay for a programmer to further diffuse state throughout objects in the application. Human beings can only hold so many things in their minds at a time. The more state is diffused, the more things someone must pay attention to simultaneously to solve a problem in that code.
Another question above was Who needs to observe changes in state? There are some corollaries: Where should those changes be published? and Does the thing observing the change need a reference to the thing that changed?
While rare, the listener pattern is prone to ordering bugs which are quite painful to reproduce and fix. Additionally, in my experience, listeners are the most potent source of memory leaks in Swing applications. So it is a pattern to avoid unless you know you really need it.

03 August 2008

Java wishlist - generics done properly, without erasure

I like generics a lot, but the current implementation can be very frustrating at times, due to erasure. There are many instances where I wish I could refer to the "generic type" and use it to create an array of that type.
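The usual workaround, sketched below, is to pass the element class in explicitly and create the array reflectively. GenericArrays and newArray are names I have made up for the example; java.lang.reflect.Array.newInstance is the standard library call.

```java
import java.lang.reflect.Array;

class GenericArrays {
    // Erasure rules out "new T[size]", so the caller supplies the element class.
    @SuppressWarnings("unchecked")
    static <T> T[] newArray(Class<T> elementType, int size) {
        // Array.newInstance returns Object, hence the unchecked cast
        return (T[]) Array.newInstance(elementType, size);
    }
}
```

For example, GenericArrays.newArray(String.class, 3) gives a real String[] of length 3. It works, but having to thread a Class<T> through the code is exactly the frustration described above.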

29 July 2008

Querying/Manipulating database schema using MetaModel

Quickly scanned this library, MetaModel. It looks like it provides a useful domain model for databases, e.g. Database, Schema, Table, Column, Index etc. It's a pain querying/manipulating this information from JDBC. It also abstracts different types of database, e.g. SQL, CSV.

Top 10 Java Performance Troubleshooting Tools | Javalobby

28 July 2008

Del.icio.us rss feed limited at 15

I use del.icio.us for bookmarking and have a number of tags as Firefox live bookmarks. E.g:


Recently I found that each tag was only displaying the last 15 entries, it looks like del.icio.us have reduced the default size for rss feeds. To get round this just change the live bookmark url to include "?count=maxNumberOfEntries", e.g:


21 July 2008

Java wishlist - disable autoboxing

Java wishlist - disable foreach on lists

The foreach loop iterates over a list using an iterator rather than get(int) access. This can be less efficient when called many times in tight loops. I wish I could easily detect when foreach has been used on a list.
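For reference, a sketch of the two loop styles side by side. ListLoops is a hypothetical class name. Whether the indexed version is actually faster depends on the JVM and the list type, so it is worth measuring before changing code.

```java
import java.util.List;
import java.util.RandomAccess;

class ListLoops {
    // foreach: the compiler turns this into calls on list.iterator()
    static long sumWithForeach(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    // Indexed access: avoids the iterator, but is only sensible for RandomAccess
    // lists such as ArrayList; on a LinkedList each get(i) walks the list.
    static long sumWithGet(List<Integer> list) {
        if (!(list instanceof RandomAccess)) {
            return sumWithForeach(list);
        }
        long sum = 0;
        for (int i = 0, n = list.size(); i < n; i++) {
            sum += list.get(i);
        }
        return sum;
    }
}
```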

Java wishlist - error on number overflow

I've often worried about numbers overflowing in my code, whether longs, ints or doubles. I would like to force an error to be thrown if any numbers overflow in my code. I would need to be able to set this on a class or package basis, so that I can still allow libraries that are included in my code to overflow if they wish.
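There is still no class-level or package-level switch for this, but later Java versions (8 and up) added Math.addExact, Math.multiplyExact and friends for int and long, which throw instead of wrapping. A small sketch, with CheckedMath as a made-up wrapper name:

```java
class CheckedMath {
    // Plain int addition: overflow wraps around silently.
    static int add(int a, int b) {
        return a + b;
    }

    // Math.addExact (Java 8+) throws ArithmeticException on int overflow.
    static int addChecked(int a, int b) {
        return Math.addExact(a, b);
    }
}
```

These have to be called explicitly at each site, so they are opt-in rather than the blanket setting wished for above. Doubles are a different case: they go to infinity rather than wrapping, and these methods do not cover them.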

Java wishlist - immutable arrays

Over the last x years I often found myself wishing for various features, thought I'd get them down in written form so I don't forget. As they come back to me I will jot them down.

An immutable array would throw an exception if anyone tried to change the reference at any one of its elements. This would allow one to return the array from a method invocation without having to defensively copy it. Of course, since Java 5, with generics, you can return typed lists that throw an error if modified.
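The list-based alternative mentioned above might look like this. Weekdays is a hypothetical class; Collections.unmodifiableList and Arrays.asList are the standard library calls.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class Weekdays {
    // Wrapped once; any attempt to modify it throws UnsupportedOperationException.
    private static final List<String> DAYS =
            Collections.unmodifiableList(Arrays.asList("Mon", "Tue", "Wed", "Thu", "Fri"));

    // Safe to hand out without a defensive copy: callers cannot change the contents.
    static List<String> days() {
        return DAYS;
    }
}
```

Calling Weekdays.days().set(0, "Sun") fails at runtime rather than at compile time, which is the same trade-off the wished-for immutable array would have.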

18 July 2008

Java Native Access (JNA): Pure Java access to native libraries

jna: Java Native Access (JNA): Pure Java access to native libraries
JNA provides Java programs easy access to native shared libraries (DLLs on Windows) without writing anything but Java code--no JNI or native code is required. This functionality is comparable to Windows' Platform/Invoke and Python's ctypes. Access is dynamic at runtime without code generation. JNA's design aims to provide native access in a natural way with a minimum of effort. No boilerplate or generated code is required. While some attention is paid to performance, correctness and ease of use take priority.

04 July 2008

FIX message dictionary

FIX Dictionary from Onix Solutions is an excellent resource that describes each message type in detail.

Motivations when writing tests

Wasn't much for me to read in My Unit Tests Are Purer Than Yours, but for an excellent quote which points out that integration tests are probably more important than unit tests:
And so, if we heed the saying "a good test is one that finds a defect", then where are we going to get the most bang for our buck? Testing very simple rules in isolation? Or testing this extremely complex interaction of rules within the framework. Obviously the latter is exponentially more likely to turn up issues and thus make for the better test!

Robust Java benchmarking

Brent Boyer has written a comprehensive set of articles on benchmarking Java code.

Robust Java benchmarking, Part 1: Issues, Understand the pitfalls of benchmarking Java code

Robust Java benchmarking, Part 2: Statistics and solutions, Introducing a ready-to-run software benchmarking framework

Understanding the closures debate

28 June 2008

A nice Firefox addon for using mouse gestures

I've updated my Firefox addon list after moving to Firefox 3. I've started using FireGestures for mouse gestures: you hold down the right mouse button and move the mouse up, down, left or right to execute various actions. The default configuration has far too many options to figure out, so I've simplified mine to just a few:

Previous Tab: L
Next Tab: R
Close Tab: RL
Open Link in Background: D
Open Link in Foreground: U
Reload Skip Cache: UDU
Popup Recently Closed Tabs: LR

Visualising jgrapht graphs using yEd

If you need to visualise graphs, such as directed acyclic graphs, then yEd is a nice free tool to use. Save the graph to GML as shown below. Then open up yEd; you can open it directly from the yWorks website using Web Start. For DAGs, I've found the "hierarchical - interactive" view with the "hierarchical - topmost" layout the best way to lay out the graph.

import org.jgrapht.DirectedGraph;
import org.jgrapht.ext.GmlExporter;
import org.jgrapht.graph.DefaultDirectedGraph;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.Writer;
// Build (or obtain) the graph to visualise
DirectedGraph graph = new DefaultDirectedGraph(...);
GmlExporter gmlExporter = new GmlExporter();
// The GML file that yEd will open
File file = ...
Writer fileWriter = new BufferedWriter(new FileWriter(file));
try {
    gmlExporter.export(fileWriter, graph);
} finally {
    // Close the writer so the buffered output is flushed to the file
    fileWriter.close();
}

Apply svn auto-properties retrospectively

A useful script that applies all auto-properties (from $HOME/.subversion/config) to your current workspace files.

24 June 2008

Never been a fan of BDUF - Big Design Upfront

Top 10 SOA Pitfalls: #5 - Big Design Up Front | Javalobby

Never been a fan of BDUF - Big Design Upfront... ever since (early on in my career) I was "required" to create endless class and sequence diagrams before being able to write a line of code... consultancies used to get paid by how much documentation they created rather than working software.

ZeroTurnaround's JavaRebel - class reloading

ZeroTurnaround.com JavaRebel

If you're developing an application that has long reload times and HotSpot's HotSwap is not cutting the mustard, then this product may be what you need. As well as supporting modifications to methods, it also supports adding/removing methods/fields/classes/annotations and changing interfaces.

Check out the screencast.

Jolt Awards

Jolt Awards

I always used to keep an eye on the Jolt Awards, back in the day when Software Development Magazine (now merged with Dr Dobbs Journal) used to cover it. Sometimes useful to see what products are getting awards... wouldn't base any decisions on it though.

Kid's swimming centre in Wimbledon

CentOS 5.2 went smoothly

[CentOS-announce] Release for CentOS-5.2 i386 and x86_64

The upgrade went through without a problem.

1) execute "yum upgrade"
2) execute "find /etc -name '*.rpm?*'" to discover any .rpmnew and .rpmsave config files that need merging into existing config files, then merge, preferably using a visual merge tool such as Meld
3) reboot

Job done

CentOS repositories - adding RPMforge

18 June 2008

Enforcing package dependencies

For some time I've wanted a tool that would let me enforce package dependencies (directed acyclic graphs). Currently I can only do this in IntelliJ IDEA, or Maven, by splitting my packages out into separate modules. However, this leads to too many modules. Googling has brought up two possibilities.

Structure101 looks awesome but costs a lot; still, it looks like a valuable tool for a commercial development team. It's a server-based tool that comes with IDE plugins.

Japan and Macker seem like cheap and cheerful open-source tools that are simple to use, as they rely on a configuration file that can be placed into source control alongside the actual source code to which it relates. I think I'll give Japan a go, as it's simple enough for my needs. Andy@Gigantiq discusses them in detail at:

Enforcing Package Dependencies (Part 4)
Enforcing Package Dependencies (Part 3)
Enforcing Package Dependencies (Part 2)
Enforcing Package Dependencies (Part 1)

13 June 2008

Segregating unit and integration tests in Maven

Unit tests are not integration tests - John Ferguson Smart's Blog

Shoal - a useful library for implementing clustering functionality

Shoal – A Dynamic Clustering Framework

Shoal is a Java-based scalable dynamic clustering framework that provides infrastructure to build fault tolerance, reliability and availability.

The framework can be plugged into any product needing clustering and related distributed systems capabilities without tightly binding to a specific communications infrastructure.

A good discussion of exceptions

Exceptions in Java by Dr. Heinz M. Kabutz

10 June 2008

IEEE1588 – Precision Time Protocol

Measurement and automation applications (and latency-sensitive electronic trading systems) often must precisely synchronize events over a widely distributed system. For example, an industrial automation system may need to synchronize distributed motion controllers, or a test and measurement system may need to measure strain gages distributed over the wing span of an airplane (or an algorithmic trading system may need to decide whether its market data is too old and monitor latency on an ongoing basis). The IEEE 1588 precision time protocol (PTP) provides a standard method to synchronize devices on a network with submicrosecond precision. The protocol synchronizes slave clocks to a master clock ensuring that events and timestamps in all devices use the same time base. IEEE 1588 is optimized for user-administered, distributed systems; minimal use of network bandwidth; and low processing overhead.

Read more

Get your Gmail as an RSS feed

Get your Gmail as an RSS feed so you can see it in your news reader.

Why it's better to be lazy, by Jeremy Meyer

Why it's better to be lazy, by Jeremy Meyer

Everyone can be loosely classified into 4 groups, which are permutations of either activeness or laziness, and intelligence or stupidity.

Intelligent Lazy people, as my father would explain, are everyone's favourite. They do things in a smart way in order to expend the least effort. They don't rush into things, taking that little bit of extra time to think and find the shortest, best path. They tend to make what they do repeatable so they don't have to go through it all again. These people usually make good leaders, and they are good to have on teams.

Intelligent Active people are useful, because they are smart, after all, but their intelligence can be slightly diluted or tempered by their activity. Where they don't have a perfect solution, they might act anyway, rather than sitting on their laurels, so useful people but not great leaders, and on a team, they can often put noses out of joint by acting too early and too fast.

Stupid Lazy people have their place too: they are easy to manage, they generally don't act on their own initiative too much and, given tasks that are not beyond them, they will perform in a predictable, consistent manner. Usually they won't cause any harm on teams. Wellington liked these people as foot soldiers.

Stupid Active people are the dangerous ones. Their default behaviour is to act, and in the absence of skill or thought they can cause all types of trouble.

07 June 2008

Figuring out what subversion checkins you've done

"svn log" will give you a changelog for everything that's happened on a subversion repository between particular versions, but what if you want to filter this out for checkins done by a particular author, such as yourself? Run:

svn log -r {2008-01-01}:{2008-06-08} --xml --username your_repo_name --password your_repo_password http://svn.domain.net/repos/xxx > changelog_for_all_authors.xml

Now you need to run a little filtering on changelog_for_all_authors.xml; some form of XPath might help. I came across a script at http://dev.agoraproduction.com/svnLogParser.php that does the job. Download the script to svn-log-parser.php and run:

php ./svn-log-parser.php changelog_for_all_authors.xml your_repo_name change_log_for_your_repo_name.xml
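If PHP isn't to hand, the XPath idea can be sketched in Java with the standard javax.xml.xpath API. This is only a rough sketch: the class and method names are made up for illustration, and it assumes the usual "svn log --xml" layout of a log element containing logentry elements, each with a revision attribute and an author child.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: lists the revisions in "svn log --xml" output committed by one author.
class SvnLogAuthorFilter {

    static List<String> revisionsBy(String svnLogXml, String author) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Supply the author as an XPath variable instead of concatenating it into the expression.
        xpath.setXPathVariableResolver(name -> "author".equals(name.getLocalPart()) ? author : null);
        try {
            NodeList entries = (NodeList) xpath.evaluate(
                    "/log/logentry[author = $author]",
                    new InputSource(new StringReader(svnLogXml)),
                    XPathConstants.NODESET);
            List<String> revisions = new ArrayList<>();
            for (int i = 0; i < entries.getLength(); i++) {
                revisions.add(((Element) entries.item(i)).getAttribute("revision"));
            }
            return revisions;
        } catch (XPathExpressionException e) {
            // Malformed XML also surfaces here, wrapped by the XPath engine.
            throw new IllegalArgumentException("Could not filter svn log XML", e);
        }
    }
}
```

Calling SvnLogAuthorFilter.revisionsBy(xml, "your_repo_name") on the contents of changelog_for_all_authors.xml would give the revision numbers for that author; pulling out the date and msg children would follow the same pattern.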

Searching for RHEL and CentOS information

06 June 2008

Date logic in Bash

You can generate strings that include relative date values using a script like:

# Delete the dated directories from 20 to 11 days ago (the -d option needs GNU date)
for n in 20 19 18 17 16 15 14 13 12 11; do
  d=$(date +'%Y.%m.%d' -d "-${n} day")
  echo "Deleting ${d}"
  rm -rf "./${d}"
done

31 May 2008

How expensive is thread context switching

Time to overcome Java misconceptions

In the Linux 2.6 NPTL library, which is where Tyma and Manson ran their tests, context switching was not expensive at all. Even with a thousand threads competing for Core Duo CPU cycles, the context-switching wasn't even noticeable.

Frameworks and the danger of a grand design

Frameworks and the danger of a grand design

This isn't really something that can be taught, at least not in such a way that it really drums the lesson home. You have to experience the pain of creating an over-designed "framework for the sake of it", believing in all good faith that your grand design will help, not hinder, maintainability.

The grand design mindset isn't just the application of an anti-pattern, or even just the inappropriate use of a normally well-behaved design pattern. The mindset is the overuse of patterns, carefully cementing nano-thin layers of indirection atop each other like a process in a chip fabrication plant that can't be shut down. It's the naïve belief that if X is good then 100X must be better every time.

Luckily this mindset is easy to spot. Your team members will be busy creating a beautiful but over-designed system with enums, annotations, closures and all the latest language features, loosely coupled classes and several hundred pluggable frameworks when a well-placed isThisTheRightValue() method would probably have sufficed.

Picture a pluggable framework that only ever has one plug. You'll see a comment in the code like, "Later this could be applied to other parts of the system."

If there's neither time nor a compelling reason to apply the pluggable framework to other parts of the system in this iteration, then it's likely the programmer going off on a design pattern hike - exploring their talents - and the framework should be struck out of the code. It's just more complexity to maintain.

Of course, some programmers will never learn: they like writing code too much. Lots of it, as if they're paid in lines of code, or reviewed that way. Or maybe it's a macho thing. Pursuing your own grand design will do that for you, but it's better to solve problems with little code. Less is more.

An alternative to JOTM for JTA

Atomikos TransactionsEssentials

Reducing Coupling Through Unit Tests

Reducing Coupling Through Unit Tests

Interesting article:

Low coupling refers to a relationship in which one module interacts with another module through a stable interface and does not need to be concerned with the other module's internal implementation.

Unit testing under any name is a good test of the ability to call your code in isolation.

If you have written code with low coupling, it should be easy to unit test. If you have written code with high degrees of coupling, you're likely to be in for a world of pain as you try to shoehorn the code into your test harness.

Is the code highly cohesive? That is, does each module carry a single, reasonably simple responsibility, and is all the code with the same responsibility combined in a single module? If code implementing a single feature of your application is littered all over the place, or if your methods and classes try to do many different things, you almost invariably end up with a lot of coupling between them, so code with low cohesion is a big red flag alerting you to the likelihood of high coupling as well.
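To make the article's point concrete, here is a minimal sketch of my own (not from the article) of code that talks to a collaborator only through a stable interface, and a test that calls it in isolation with a stub. All the names (PriceSource, OrderValuer, OrderValuerCheck) are hypothetical.

```java
// Hypothetical names throughout; a sketch of low coupling through a stable interface.
interface PriceSource {
    double priceOf(String symbol);
}

// Depends only on the PriceSource interface, never on a concrete market data feed.
class OrderValuer {
    private final PriceSource prices;

    OrderValuer(PriceSource prices) {
        this.prices = prices;
    }

    double valueOf(String symbol, int quantity) {
        return prices.priceOf(symbol) * quantity;
    }
}

// A unit test in miniature: the stub stands in for the real price feed.
class OrderValuerCheck {
    public static void main(String[] args) {
        PriceSource fixedPrice = symbol -> 2.5;
        OrderValuer valuer = new OrderValuer(fixedPrice);
        if (valuer.valueOf("ABC", 4) != 10.0) {
            throw new AssertionError("expected 10.0");
        }
        System.out.println("ok");
    }
}
```

If OrderValuer created its own price feed internally, there would be nowhere to plug in the stub, which is the "world of pain" the article describes.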

10 May 2008

Possibly a useful static analysis tool that doesn't spam you with issues

John Ferguson Smart's Blog: JavaOne 2008 - FindBugs is a great little tool, and it just keeps getting better!
There are a lot of static analysis tools out there, but FindBugs is unique. Where Checkstyle will raise 500 issues, and PMD 100, FindBugs will only raise 10 - but you damn well better look at them carefully!

That is a slight over-simplification, but it does reflect the philosophy of FindBugs. FindBugs uses more sophisticated analysis techniques than tools like PMD and Checkstyle, working at the bytecode level rather than with the source code, and is more focused on finding the most high priority and potentially dangerous issues.

Can Redhat's realtime kernel reduce latency?

Low-Latency.com » Blog Archive » Open Debate on Latency Numbers is ‘Healthy’, But Just How Fast is Fast? By Simon Garland, Chief Strategist at Kx Systems
An institution may be running the fastest and most efficient feed handlers, but its servers may be running a standard Linux distribution. We’ve seen an improvement in a number of cases when a switch is made to, for example, the real-time version of Red Hat. It’s not as though an institution needs to have its private team of kernel hackers; today one gets pretty standard distributions, which can make a significant difference.

Java to get better runtime optimisations

JavaOne: AMD Keynote | Javalobby
Two features were given the spotlight to show how “Java is moving closer to the bare metal as it evolves”. By that AMD means the JVM will be using an increasing number of instruction sets and features for tuning purposes going forward and we could ultimately see the JVM being run directly on a hypervisor, sans operating system. The features are:

* Light-weight profiling (LWP): Designed to improve software parallelism through new hardware features in future versions of AMD processors. It will allow technologies like Java to more easily benefit from the multi-core processors that are now being designed and deployed.

* Advanced Synchronization Facility (ASF): Created to increase concurrency performance and introduces hardware read barriers to help with Garbage Collection.

Tool for tracing application behaviour for latency analysis - another tool

BTrace looks interesting, a bit like DTrace, but it works with Java and doesn't need Solaris

03 May 2008

ssh tunnels

Say you have machineX, which can access machineZ:8888, and machineY, which can't. Yet you want to access machineZ:8888 from machineY. You can run the following on machineY:

userA@machineY> ssh -N -f -L 8888:machineZ:8888 userA@machineX

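Now, on machineY you can connect to localhost:8888 and automatically be tunnelled to machineZ via machineX. The -L option forwards local port 8888 to machineZ:8888 through machineX, -N tells ssh not to run a remote command, and -f puts ssh into the background.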

26 April 2008

Tool for tracing application behaviour for latency analysis

Recently I did some work to determine where latency was being added for a particular use-case. Luckily, I was able to build my testcase by subclassing every part I wanted to instrument to add instrumentation. The instrumentation was simple, placing timestamps into a hashmap, keyed by the name of the location in the code. Obviously this is not the best way of doing this. You don't always have the opportunity to subclass the most important locations. More flexible solutions would be:

1) JInspired's JXInsight
2) An aspect-oriented library or standard Java instrumentation to instrument every method
3) DTrace with Java
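
For reference, the simple timestamp-in-a-hashmap approach described above might look something like the following. It is a reconstruction for illustration rather than the code I actually used: the class and method names are hypothetical, it is not thread-safe, and elapsedNanos will fail with a NullPointerException if a location was never marked.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the timestamp-in-a-map instrumentation; names are illustrative only.
class LatencyTrace {
    // Insertion-ordered so the report follows the order in which locations were reached.
    private final Map<String, Long> timestamps = new LinkedHashMap<>();

    // Record the current time against a named location in the code.
    void mark(String location) {
        timestamps.put(location, System.nanoTime());
    }

    // Nanoseconds elapsed between two previously marked locations.
    long elapsedNanos(String from, String to) {
        return timestamps.get(to) - timestamps.get(from);
    }

    // One line per location, showing the delta from the previously marked location.
    String report() {
        StringBuilder sb = new StringBuilder();
        Long previous = null;
        for (Map.Entry<String, Long> e : timestamps.entrySet()) {
            long delta = previous == null ? 0 : e.getValue() - previous;
            sb.append(e.getKey()).append(" +").append(delta).append("ns\n");
            previous = e.getValue();
        }
        return sb.toString();
    }
}
```

Each instrumented subclass would call mark("someLocation") before delegating to the real method, and the test case would print report() at the end of the use-case.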

23 April 2008

latest vm parameters I use


07 April 2008

Set heap size for maven

On Windows:

set MAVEN_OPTS=-Xmx384M

On Linux or Mac OS X (bash):

export MAVEN_OPTS=-Xmx384M

26 March 2008

Log level philosophy

The level of severity at which you log a particular line probably doesn't get a great deal of attention. Points of contention are probably whether something is an error or a warning, or whether it is info or debug. Here Alex Miller discusses his thoughts on the subject.

Some comments on the article:

"DEBUG should be used only during development for debugging."... I've seen a lot of systems that always run in debug.

The article doesn't discuss fatal errors. I use this level of logging as a way of deciding that the log line should be sent to a support team as an email or SMS. A more powerful solution, though, is a system that watches log files and matches certain (regex) patterns against each log line, with rules deciding what should be done, e.g. send an SMS, stop the system, etc.
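
The rule-matching part of such a log watcher could be sketched roughly as below. This is only an illustration of the idea: the class, the Action values and the example patterns are all hypothetical, and tailing the file and actually sending the SMS are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: rule and action names are illustrative, not from any real tool.
class LogRuleEngine {
    enum Action { SEND_EMAIL, SEND_SMS, STOP_SYSTEM }

    // Pairs a compiled regex with the action to take when a log line matches it.
    private static final class Rule {
        final Pattern pattern;
        final Action action;

        Rule(Pattern pattern, Action action) {
            this.pattern = pattern;
            this.action = action;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    LogRuleEngine addRule(String regex, Action action) {
        rules.add(new Rule(Pattern.compile(regex), action));
        return this;
    }

    // Returns the action of every rule whose pattern occurs in the line, in rule order.
    List<Action> actionsFor(String logLine) {
        List<Action> actions = new ArrayList<>();
        for (Rule rule : rules) {
            if (rule.pattern.matcher(logLine).find()) {
                actions.add(rule.action);
            }
        }
        return actions;
    }
}
```

For example, a rule added with addRule("\\bFATAL\\b", Action.SEND_SMS) would fire on any line containing the word FATAL, whatever logging framework produced it.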

Also, I do get frustrated when I see logging that doesn't include much detail, e.g. "Failed to submit order"... how about logging the details of the order too? This helps debugging and log forensics no end.

05 March 2008

Note to self: a good fence fitter

Nicholas McCarthy did a very good job at fitting a new fence panel in my back garden.

Contact: 07958 733 554 and 0208 543 4909

23 February 2008

A good place to buy media centre PCs

Glow Technology ... if you're okay with Vista Media Center. E.g. check out http://www.glowtechnology.co.uk/product_info.php?cPath=2_15&products_id=47. Now if only the XBMC guys get XBMC ported to a Linux platform I won't have to go down the Vista Media Center path.

Guide to Codecs in Vista Media Center

Recommends Haali Media Splitter and ffdshow

19 February 2008

Deploying maven snapshots to a remote repository

mvn install deploy:deploy -DaltDeploymentRepository=snapshots::default::dav:http://bats-dev-2.intranet.barcapint.com:8089/archiva/repository/snapshots

The altDeploymentRepository argument specifies an alternative repository to deploy snapshots to, in the format id::layout::url.

16 February 2008

Flushing the DNS cache in Windows Vista

Here is how to flush the DNS cache in Vista.

1. Click the Microsoft Vista Start logo in the bottom left corner of the screen
2. Click All Programs
3. Click Accessories
4. Right-click on Command Prompt
5. Select Run As Administrator
6. In the command window type the following and then hit enter: ipconfig /flushdns

15 February 2008

Blodgett's Law

A development process that involves any amount of tedium will eventually be done poorly or not at all.

11 February 2008

Show me the money, Martin Fowler explains why good developers are worth a lot more


One of the commonly accepted beliefs in the software world is that talented programmers are more productive. Since we CannotMeasureProductivity this is a belief that cannot be proven, but it seems reasonable. After all just about every human endeavor shows some people better than others, often markedly so. It's also commonly observed by programmers themselves, although it always seems to be remarked on by those who consider themselves to be in the better talented category.

Naturally better programmers cost more, either as full-time hires or in contracting. But the interesting question is, despite this, are more expensive programmers actually cheaper?

On the face of it, this seems a silly question. How can a more expensive resource end up being cheaper? The trick, as it is so often, is to think about the broader picture of cost and value.

Although the technorati generally agree that talented programmers are more productive than the average, the impossibility of measurement means they cannot come up with an actual figure. So let's invent one for argument's sake: 2. If you can find a factor-2 talented programmer for less than twice the salary of an average programmer - then that programmer ends up being cheaper. To state this more generally: If the cost premium for a more productive developer is less than the higher productivity of that developer, then it's cheaper to hire the more expensive developer. The cheaper talent hypothesis is that the cost premium is indeed less, and thus it's cheaper to hire more productive developers even if they are more expensive.

In case anyone hasn't noticed this hypothesis is a key part of our philosophy at ThoughtWorks and is one of the main reasons why I ended up switching from an independent consultant to join. We believe we actually end up cheaper for our clients, even though our rates were higher. Of course, we do have difficulty persuading many clients that this is true - that lack of objective productivity measures strikes again. I still remember a meeting with one prospective client complaining about how our rates were higher than a company who had made a previous, failed, attempt at the system we were bidding on. We had to politely point out that paying less rates for a project that delivered no value was hardly a financially prudent strategy.

There are some notable consequences to the cheaper talent hypothesis. Most notable is that it actually follows a positive scaling effect - the bigger the team the bigger the benefits of cheaper talent. Let's assume we actually have put together a team of ten talented developers to run a project in some alternative universe where we have actually measured that they are twice as productive as the average - and thus do cost exactly twice as much to hire. In this case you might naturally assume that a rival team of average programmers would be a team of twenty.

The trouble is that that assumption assumes productivity scales linearly with team size, which again observation indicates isn't the case. Software development depends very much on communication between team members. The biggest issue on software teams is making sure everyone understands what everyone else is doing. As a result productivity scales a good bit less than linearly with team size. As usual we have no clear measure, but I'm inclined to guess at it being closer to the square root. If we use my evidence-free guess as the basis then to get double the productivity we need to quadruple the team size. So our average talent team needs to have forty people to match our ten talented people - at which point it costs twice as much.
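
Fowler's evidence-free square-root guess can be worked through as follows. The symbols are my own labels rather than his: P is team output, p is the individual productivity factor, n is the team size and c is the cost of one average developer, and the model P = p√n is simply one reading of "closer to the square root".

```latex
\begin{align*}
P(n, p) &= p\sqrt{n} \\
\text{talented team:}\quad P(10, 2) &= 2\sqrt{10} \\
\text{average team:}\quad 1\cdot\sqrt{n} = 2\sqrt{10} &\;\Rightarrow\; n = 4 \times 10 = 40 \\
\text{cost:}\quad 10 \times 2c = 20c &\quad\text{versus}\quad 40 \times c = 40c
\end{align*}
```

So under that assumption the average team costs 40c against 20c for the same output, which is the "twice as much" in the paragraph above.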

Another factor that plays a role here is time-to-market. Let's assume two teams of four people, one talented and one average. To stack the deck of our argument against our talented team, discount the previous paragraphs, and assume the talented team is only twice as productive as the average team. If the talented team charges twice as much then can we assume that it doesn't matter financially which team we pick?

I'm afraid the talented team wins again. They'll complete the project in half of the time of the average team, which means that the customer will start yielding value from the delivered software earlier. This earlier value, compounded by the time value of money, represents a financial gain for picking the talented team, even though their cost per output is the same.

Agile development further accelerates this effect. A talented team has a faster cycle time than an average team. This allows the full team to explore options faster: building, evaluating, optimizing. This accelerates producing better software, thus generating higher value. This compounds the time-to-market effect. (And it's natural to assume that a talented team is more likely to produce better software in any case.)

Faster cycle time leads to a better external product, but perhaps the greatest contribution a talented team can make is to produce software with greater internal quality. It strikes me that the productivity difference between a talented programmer and an average programmer is probably less than the productivity difference between a good code-base and an average code-base. Since talented programmers tend to produce good code-bases, this implies that the productivity advantages compound over time due to internal quality too.

All this sounds, at least to me, like a highly compelling argument. It's also one that's widely accepted (at least by programmers who consider themselves talented). But it's far off being accepted by the software industry as a whole. We can tell this because the premium for talented developers (in terms of salary/contracting fees) is less than the productivity difference. Probably the major reason for this is the inability to objectively measure productivity. A hirer cannot have objective proof that a more expensive programmer is actually more productive. Only the higher cost is objective. As a result a hirer has to match a subjective judgment of higher value against an objective higher cost. Many hirers, even if they personally believe the talented programmer is worthwhile, aren't prepared to justify the full higher cost to managers, HR, and purchasing.

This effect is compounded by the difficulty in making even a subjective assessment. At ThoughtWorks we rely on peer assessment - developers' abilities are assessed by fellow team members. The result is hardly pinpoint precision, but it's the best anyone can do.

Which all points out that hiring and retaining talented programmers is hard work. Hiring and assessment is hard work. You have to deal with people with very individual desires, which are even more important to track as they are effectively underpaid. So a hirer is faced with certain extra work and higher costs versus only a judgment call for higher productivity.

So I understand the situation but don't accept it. I believe that if the software industry is to fulfill its potential it needs to recognize the cheaper talent hypothesis and close the gap between high productivity and higher compensation.

04 February 2008

Write less code

Excerpt from The programmer who programs least, programs best.

So, how can we mitigate this issue? The obvious way is to write less code! Now, I'm not saying that you should be lazy and start programming less, I am saying that every time you add code to your application you should do so consciously. While you are adding code you should be conscious of whether you are repeating functionality, because any time that you can reduce the amount of code you are better off. Just slow down and think about what you are doing. In the software world our schedules can be crazy and in the haste to get the code written and the application out the door we can cut short the parts of software development that actually lead to good applications.

Another thing you can do is to constantly look for code to refactor in order to shorten, simplify, or completely remove it. Constantly ask yourself if a piece of code needs to be there or not. Refactoring can be troublesome though, and if you do not have sufficient tests to check your refactorings then you may end up introducing more bugs than you are avoiding.

One last thing you can do is to avoid large or complex methods and classes. You may think that bugs are distributed evenly around an application, but that just isn't true. According to Steve McConnell one study found that "Eighty percent of the errors are found in 20 percent of a project's classes or routines" and another found that "Fifty percent of the errors are found in 5 percent of a project's classes". The more large and complex a method or routine is, the more likely it is to contain errors and the harder those errors will be to fix.

Chamonix accommodation

Will be staying at Androsace Richardson between Feb 27 and Mar 2nd 2008

Ski holiday websites

A collection of websites that I looked at: