22 December 2009

That Rotting Design | Javalobby

We’ve all experienced that sinking feeling when maintaining a piece of crappy software. Has my change broken the system in some unintended way? What is the ramification of my change on other parts of the system? If you’re lucky, and the system has a robust suite of unit tests, they can offer some support in proving your work. In practice, however, few systems have thorough automated test coverage. Mostly we’re alone, left to verify our changes as best as possible. We might privately criticize the original developers for creating such garbage. It certainly offers a plausible excuse for why the maintenance effort is so costly or time-consuming. Or it might serve as the basis upon which we recommend a re-write. But mostly, we should wonder how it happened.

For sure, most software doesn’t start out this way. Most software starts out clean, with a clear design strategy. But as the system grows over time, strange things begin to happen. Business rules change. Deadline pressures mount. Test coverage slips. Refactoring is a forgotten luxury. And the inherent flaws present in every initial design begin to surface. Reality has proven that few enterprise development teams have the time or resources to fix a broken design. More often, we are left to work within the constraints of the original design. As change continues, our compromises exacerbate the problem. The consequence of rotting design is seen throughout the enterprise on a daily basis. Most apparent is the effect on software maintenance. But rotting design leads to buggy software and performance degradation, as well. Over time, at least a portion of every enterprise software system experiences the problem of rotting design. A quote from Brooks sums it up well:

"All repairs tend to destroy the structure, to increase the entropy and disorder of the system. Less and less effort is spent on fixing the original design flaws; more and more is spent on fixing flaws introduced by earlier fixes. As time passes, the system becomes less and less well-ordered. Sooner or later the fixing ceases to gain any ground. Each forward step is matched by a backward one. Although in principle usable forever, the system has worn out as a base for progress."

The most obvious question is, “How do we prevent rotting design?” Unfortunately, rotting design is not preventable, only reducible. Over the past ten years, the design patterns movement has provided insight into the qualities of good design. Dissecting design patterns reveals many important design principles that contribute to more resilient software design. “Favor object composition over class inheritance” and “program to an interface, not an implementation” are two examples. Alone, however, design patterns that emphasize class structure are not enough to help reduce rotting design.
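The two principles named above can be illustrated with a small Java sketch. Every name in it (PriceRule, FlatDiscount, PercentTax, Checkout) is hypothetical and chosen only for illustration; nothing here comes from the original article. Checkout depends only on the PriceRule interface and is composed of rule objects rather than inheriting pricing behaviour from a base class.

```java
import java.util.ArrayList;
import java.util.List;

class CompositionSketch {
    /** Clients depend on this abstraction, not on any concrete rule. */
    interface PriceRule {
        long apply(long cents);
    }

    /** Hypothetical rule: subtracts a fixed amount, never going below zero. */
    static final class FlatDiscount implements PriceRule {
        private final long discountCents;
        FlatDiscount(long discountCents) { this.discountCents = discountCents; }
        @Override public long apply(long cents) { return Math.max(0, cents - discountCents); }
    }

    /** Hypothetical rule: adds a percentage tax, rounded down. */
    static final class PercentTax implements PriceRule {
        private final int percent;
        PercentTax(int percent) { this.percent = percent; }
        @Override public long apply(long cents) { return cents + cents * percent / 100; }
    }

    /** Composed of rules instead of extending a pricing base class. */
    static final class Checkout {
        private final List<PriceRule> rules;
        Checkout(List<PriceRule> rules) { this.rules = new ArrayList<>(rules); }

        long total(long cents) {
            long result = cents;
            for (PriceRule rule : rules) {
                result = rule.apply(result); // rules are applied in list order
            }
            return result;
        }
    }

    public static void main(String[] args) {
        List<PriceRule> rules = new ArrayList<>();
        rules.add(new FlatDiscount(500));
        rules.add(new PercentTax(10));
        System.out.println(new Checkout(rules).total(10_000)); // prints 10450
    }
}
```

Adding a new rule means adding a new PriceRule implementation; Checkout itself does not change, which is the kind of abstract coupling the patterns aim for.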

Reducing Rot

Most patterns emphasize class design, and present techniques that can be used in specific contexts to minimize dependencies between classes. Teasing apart the underlying goal of most patterns shows us that each aims to manage the dependencies between classes through abstract coupling. Conceptually, classes with the fewest dependencies are highly reusable, extensible, and testable. The greatest influence in reducing design rot is minimizing unnecessary dependencies. Yet enterprise development involves creating many more entities beyond only classes. Teams must define the package structure in which those classes live, and the module structure in which they are deployed. Increasing the survivability of your design involves managing dependencies between all software entities - classes, packages, and modules.

But if minimal dependencies were the only trait of great design, developers would lean towards creating very heavy, self-contained software entities with a rich API. While these entities might have minimal dependencies, extreme attempts to minimize dependencies result in excessive redundancy across entities, with each providing its own built-in implementation of common behavior. Ironically, avoiding redundant implementations, thereby maximizing reuse, requires that we delegate to external entities, increasing dependencies. Attempts to maximize reuse result in excessive dependencies, and attempts to minimize dependencies result in excessive redundancy. Neither is ideal, and a gentle balance must be sought when defining the behavior, or granularity, of all software entities - classes, packages, and modules. For more on the use/reuse paradox, see Reuse: Is the Dream Dead?

Software design is in a constant quandary. Any single element key to crafting great designs, if taken to its individual extreme, results in directly the opposite - a brittle design. The essential complexity surrounding design is different for every software development effort. The ideal design for a software system is always the product of its current set of behavioral specifications. As behavior changes, so too must the granularity of the software entities and the dependencies between them. The most successful designs are not characterized by their initial brilliance, but instead through their ability to withstand the test, and evolve over the course, of time. As the complexity of software design is an essential complexity surrounding software development, our hopes lie with technologies and principles that help increase the ability of your design to survive. Such is the reason why agile architecture is so important.

A Promising Future

I’m hopeful that all software developers have experienced the pleasure of a design that has withstood the test of time. Unfortunately, many enterprise development teams have too few of these experiences. Likewise, few enterprise development teams devote adequate effort to package and module design. It’s unreasonable to believe that even the most flexible class structure can survive should the higher level software entities containing those classes not exhibit similarly flexible qualities. The problems are rampant. Increased dependencies between packages and modules inhibit reusability, hinder maintenance, prevent extensibility, restrict testability, and limit a developer’s ability to understand the ramification of change.

Services offer some promise to remedy our failures with object-oriented development. Yet, while services may offer tangible business value, within each awaits a rotting design. There exists a world between class design and web services that deserves more exploration, and as an industry, we are beginning to notice. OSGi is a proven module system for the Java platform, while Jigsaw aims to modularize the JDK. JSR-294 aims to improve modularity on the Java platform. While some friction might exist between the constituencies involved, it’s only because they too recognize that something has been missing, and are passionate about fixing the problem. Of course, it doesn’t stop there. A plethora of application servers and tools are also including support for modularity using OSGi, which has grown into the de facto standard module system on the Java platform.

All aim to help manage the complexity, from design through deployment, of enterprise software development. With each, new practices, heuristics, and patterns will emerge that increase the ability of a design to grow and adapt.

20 December 2009

Coding: An Outside in Observation | Javalobby

it's important to drive the design of an API based on the way that it will be used by its clients:

"A great way to get usable APIs is to let the customer (namely, the caller) write the function signature, and to give that signature to a programmer to implement. This step alone eliminates at least half of poor APIs: too often, the implementers of APIs never use their own creations, with disastrous consequences for usability"

This is similar to Michael Feathers' Golden Rule of API Design:

"It's not enough to write tests for an API you develop, you have to write unit tests for code that uses your API.
When you do, you learn first-hand the hurdles that your users will have to overcome when they try to test their code independently"

When we don't do this we start guessing how we think things might be used and we end up with generic solutions which solve many potential use cases in an average to poor way and none of them in a good way.

Why Is Agile So Hard To Sell? | Agile Zone

Now we have pockets of agile within large organizations... sometimes we might even have agile across entire large enterprises. We are exploring agile methods and trying to see if they can deliver on the small team promise... but in the large. The main difference with these larger organizations is that value isn't the same as it is in a small team or a small company. There is not a direct correlation between team performance and business outcomes... there is not a direct connection between what the team delivers and what we can sell to our customers. It takes the output of too many small teams coming together to deliver anything of value.

Coding: Naming | Javalobby

...the importance of creating a shared understanding of the different types/objects in the systems that we build.

Names are there to make it easier for us to talk and reason about
code – if they're not doing that then we need to change them until they

"Agile is treating the symptoms, not the disease" | Agile Zone

how does an agile process reduce the complexity load? And the answer, of course, is that it doesn't—it simply tries to muddle through as best it can, by doing all of the things that developers need to be doing: gathering as much feedback from every corner of their world as they can, through tests, customer interaction, and frequent releases. All of which is good. I'm not here to suggest that we should all give up agile and immediately go back to waterfall and Big Design Up Front. Anybody who uses Billy's quote as a sound bite to suggest that is a subversive and a terrorist and should have their arguments refuted with extreme prejudice.

But agile is not going to reduce the technology complexity load, which is the root cause of the problem.

19 December 2009

Java Performance Tuning Tips Nov 2009

Tips November 2009
eBay's Challenges and Lessons (Page last updated October 2009, Added 2009-11-30, Author Randy Shoup, Publisher eBay). Tips:

* Partition everything so that you can scale it as necessary.
* Use asynchrony everywhere, connected through queues.
* Use recovery and reconciliation rather than synchronous transactions.
* Automate everything - components should automatically adjust and the system should learn and improve itself.
* Monitor everything, providing service failover for parts that fail.
* Avoid distributed transactions.
* Design for extensibility, deploy changes incrementally.
* Minimize and control dependencies, use abstract interfaces and virtualization, components should have an SLA.
* Save all (monitoring) data - this provides optimization opportunities, predictions, recommendations.
* Maximize the utilization of every resource.
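
The "asynchrony connected through queues" tip above can be sketched in plain Java with a BlockingQueue between a producer and a consumer thread. This is a minimal illustration only: the class name, the STOP marker, and the "handled:" prefix are assumptions made for the example and are unrelated to eBay's actual systems.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueHandoffSketch {
    // Hypothetical end-of-stream marker; a real message equal to it would also stop the consumer.
    private static final String STOP = "__stop__";

    /** Hands each message to a consumer thread through a queue and returns what it processed. */
    static List<String> process(List<String> messages) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> processed = new ArrayList<>();

        Thread consumer = new Thread(() -> {
            try {
                for (String m = queue.take(); !m.equals(STOP); m = queue.take()) {
                    processed.add("handled:" + m); // stand-in for real work
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (String m : messages) {
                queue.put(m); // the producer enqueues and moves on without waiting for the work
            }
            queue.put(STOP);
            consumer.join(); // join() also makes the consumer's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while handing off messages", e);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(process(Arrays.asList("bid", "listing", "payment")));
        // prints [handled:bid, handled:listing, handled:payment]
    }
}
```

The producer only depends on the queue, so the consumer can be slowed down, restarted, or multiplied without changing the producing side.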

Performance Considerations in Distributed Applications (Page last updated September 2009, Added 2009-11-30, Author Alois Reitbauer, Publisher JavaLobby). Tips:

* Many distributed application performance issues stem from inefficient serialization of objects (to send across the wire).
* Connection pooling can be an important performance improvement for many types of distributed communications (not just database communications).
* Asynchronous communication is more difficult to design and implement, but is usually more efficient for distributed communications.
* Making too many service calls from the client is a frequent source of performance problems.
* Using a more efficient communication protocol can make a huge difference to overall performance (Wrong Protocol Anti-pattern) - the overhead of SOAP compared to RMI-JRMP is significant. An inefficient protocol can cause a performance degradation by a factor of ten and significantly higher CPU and memory consumption.
* When building distributed applications, make as few remote calls as possible (Chatty Application Anti-pattern).
* Keep message size as small as possible, for example the overhead of serialisation is proportional to the size of the transferred objects (Big Messages Anti-pattern).
* With a high number of deployment units, the interaction of services becomes more and more difficult to understand. This can lead to a situation where two heavily-interacting services are deployed on different hardware, resulting in a high number of remote calls. You need to analyze the frequency of interactions as well as data volume to structure deployment accordingly and avoid inefficient deployment configurations (Distributed Deployment Anti-pattern).
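
The Chatty Application anti-pattern in the list above can be made concrete by counting round trips. In this sketch the PriceService interface and its in-memory implementation are hypothetical; no network is involved, and each method call simply stands in for one remote call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChattyCallSketch {
    /** Hypothetical remote service; each method invocation stands for one network round trip. */
    interface PriceService {
        long priceOf(String itemId);               // one item per round trip
        List<Long> pricesOf(List<String> itemIds); // many items in one round trip
    }

    /** In-memory stand-in that counts round trips instead of using a network. */
    static final class CountingPriceService implements PriceService {
        int roundTrips = 0;

        @Override public long priceOf(String itemId) {
            roundTrips++;
            return lookup(itemId);
        }

        @Override public List<Long> pricesOf(List<String> itemIds) {
            roundTrips++;
            List<Long> prices = new ArrayList<>();
            for (String id : itemIds) {
                prices.add(lookup(id));
            }
            return prices;
        }

        private long lookup(String itemId) { return itemId.length() * 100L; } // dummy price
    }

    /** Chatty client: one call per item. */
    static int chattyRoundTrips(List<String> ids) {
        CountingPriceService service = new CountingPriceService();
        for (String id : ids) {
            service.priceOf(id);
        }
        return service.roundTrips;
    }

    /** Batched client: one call for the whole list. */
    static int batchedRoundTrips(List<String> ids) {
        CountingPriceService service = new CountingPriceService();
        service.pricesOf(ids);
        return service.roundTrips;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("a", "bb", "ccc");
        System.out.println("chatty: " + chattyRoundTrips(ids) + ", batched: " + batchedRoundTrips(ids));
        // prints chatty: 3, batched: 1
    }
}
```

The batched call trades more round trips for a larger message, so it has to be weighed against the Big Messages anti-pattern from the same list.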

Most Popular Top 10s of 2009 - Lifehacker Top 10 - Lifehacker

Java Concurrency: Allocation Instrumenter for Java

IntelliJ IDEA 9 Released - Community edition comparison matrix

How To Be A Successful Developer | Javalobby

* Always strive to improve yourself and learn more.
* Share information freely with others — be generous.
* Focus on developing good working relationships with your coworkers, both technical staff and others.
* Effective communication, both written and spoken, is crucial.
* Get involved in open source.
* Be precise.
* Deliver on commitments, or renegotiate them if you need to.
* In everything that you do, do it with integrity.

Almost none of these have anything to do with knowledge of technology. I believe that social aspects have far more impact on success than anything else. Of course, being knowledgeable helps too; however, what's more important than knowing a specific technology is being able to pick up the knowledge that you need, when you need it.

A few things that I missed in my response because I take them for granted:

* Have passion for what you do.
* Strive for excellence.
* Avoid being self-righteous.

A (Brief) History of Object-Functional Programming

01 December 2009

DavMail facade over MS Exchange

DavMail POP/IMAP/SMTP/CalDav/LDAP Exchange Gateway
Ever wanted to get rid of Outlook? DavMail is a POP/IMAP/SMTP/Caldav/LDAP exchange gateway allowing users to use any mail/calendar client (e.g. Thunderbird with Lightning or Apple iCal) with an Exchange server, even from the internet or behind a firewall through Outlook Web Access. DavMail now includes an LDAP gateway to Exchange global address book to allow recipient address completion in mail compose window and full calendar support with attendees free/busy display.
DavMail Architecture

The main goal of DavMail is to provide standard compliant protocols in front of proprietary Exchange. This means LDAP for address book, SMTP to send messages, IMAP to browse messages on the server in any folder, POP to retrieve inbox messages only and Caldav for calendar support. Thus any standard compliant client can be used with Microsoft Exchange.

DavMail gateway is implemented in Java and should run on any platform. Releases are tested on Windows, Linux (Ubuntu) and Mac OS X. The tray does not work on Mac OS X and is replaced with a full frame. Tested successfully with the iPhone (gateway running on a server).

29 November 2009

61 Free Apps We're Most Thankful For - Thanksgiving - Lifehacker

Handbrake for ripping DVDs

HandBrake Updates to 0.9.4 with Over 1,000 Changes, 64-Bit Support - Handbrake - Lifehacker
Windows/Mac/Linux: If you ever have to rip a DVD to your desktop or convert video, you know how awesome open-source encoder HandBrake is

Dilbert Capitalism


Architecting a system need a wide knowledge of technologies, COTS, projects, standards.... | Java.net

When we start a new project as architects, we are basically dealing with a set of requirements. Our architecture should act as the foundation for designing and implementing those requirements as a software system that lets the customer meet its needs in a better and more efficient way.

Preparing the architecture for a software system requires the architect not only to be familiar with the domain but also to be well aware of the new technologies, frameworks, COTS, and standards available, both for the domain and for the development platform that will realize the architecture as a working piece of software.

Knowing a platform is the foundation for knowing the technologies, standards, and popular frameworks around it. To know the COTS or open source projects in the domain we are going to work on, however, we need either experience implementing software in similar fields or some research into that domain.

Enough experience in architecture and knowledge of the domain, the platform, and the free or commercial enablers are necessary for creating a good architecture, but they are not sufficient, because of advances in the deployment models one can select to develop and deploy the system independently of the domain itself. Knowledge of and experience with component models, modularity frameworks, integration standards and solutions, and deployment models form the next set of necessities.

Finally, I think being passionate about something plays a big role in succeeding at it. Passion is therefore the most notable requirement for an architect; without it, the architect cannot achieve the goal, which is nothing other than preparing a good architecture and realizing it as good software.

Testing Frameworks for Java with Code Samples | Javalobby

Example of parameterised JUnit tests

Data-driven tests with JUnit 4 and Excel | Java.net
A useful example of parameterised JUnit tests. I'm not sure I'd want to maintain test results in an Excel spreadsheet, though; why not just export it to CSV?

Why do you run CentOS

Why do you run CentOS ?
I've been running CentOS for many years now and have been very happy with it. It's basically a copy of Red Hat Enterprise Linux, provided for free. The benefit of this is a stable OS environment that matches the one used in most enterprises.

The Abuse of Utility Creational Methods | Javalobby

Here’s a quote from Martin Fowler’s Refactoring book: “Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” I’m not saying that I’m a good (nor bad) programmer. All I am saying is that this phrase is in my mind whenever I write code. So, when I write code, I think about the person who will replace me some time in the future. A person that will probably be unfortunate to debug my code without me around to explain to him what’s going on.

Pingtest Assesses the Quality of Your Internet Connection - SpeedTest - Lifehacker

13 November 2009


Project of the day: YouDebug | Java.net
YouDebug is a debugger but it's not a debugger. Let's say you have the following program, which computes a String and then takes a substring of it.

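public class SubStringTest {
    public static void main(String[] args) {
        String s = someLengthComputationOfString();
        System.out.println(s.substring(5)); // the failing call (line 7 in the original source file)
    }

    private static String someLengthComputationOfString() {
        return "test"; // placeholder assumption: the real computation is elided in the original
    }
}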

One of your users is reporting that it's failing with a StringIndexOutOfBoundsException:

Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.String.substring(String.java:1949)
at java.lang.String.substring(String.java:1916)
at SubStringTest.main(SubStringTest.java:7)

So you know s.substring(5) isn't working, but you want to know the value of the variable 's'.

To do this, let's set the breakpoint at line 7 and see what 's' is:

breakpoint("com.acme.SubStringTest",7) {
    println "s="+s;
}

Now, you send this script to the user and ask him to run the program as follows:

$ java -agentlib:jdwp=transport=dt_socket,server=y,address=5005 SubStringTest
Listening for transport dt_socket at address: 5005

And then the user executes YouDebug on a separate terminal. YouDebug attaches to your program, and your script will eventually produce the value of 's':

$ java -jar youdebug.jar -socket 5005 SubStringMonitor.ydb

09 November 2009

The (Positive) Effect of Adding New People to Project Teams

Xtreamer replaces my XBMC

Bought an Xtreamer media player to finally replace my trustworthy XBMC, which has been running on a first gen Xbox for many years now. The interface isn't very nice, but it plays all formats I might ever want to play and can deal with HD formats including AVCHD from my camcorder, and it's only about £100.

29 October 2009

What would you call the enemy of innovation?

/dev/null : Weblog
I was interviewed recently on the topic of innovation as part of the Oracle Innovation Showcase. Reading back through it, I liked this question and answer:

Q: What would you call the enemy of innovation?

A: The first is complacency. A lot of people are satisfied with what they have. If you can convince yourself you're satisfied, you'll stop looking for a better solution. The other is an inability to listen and appreciate the complexity of a problem. Everyone wants to believe they have the answer before the question even gets asked. People don't take the time to listen and appreciate the individual complexity of each customer's problem and the nuances of their environments.

26 October 2009

CentOS 5.4 released

Manuals/ReleaseNotes/CentOS5.4 - CentOS Wiki
yum clean all
yum update glibc\*
yum update yum\* rpm\* python\*
yum clean all
yum update
shutdown -r now

19 October 2009

DriverMax to automatically download Windows drivers

Downloaded DriverMax today and it found quite a few drivers on my laptop that needed updating even though I have Windows Update switched on.
DriverMax is a new tool that allows you to download the latest driver updates for your computer. No more searching for rare drivers on discs or on the web or inserting one installation CD after the other. Just create a free account, log in, and start downloading the updates that you need.

18 October 2009

Google Collections MapMaker

Google Collections MapMaker - referencing http://www.javaperformancetuning.com/news/news105.shtml
Google commons ConcurrentMap builder, providing any combination of: soft or weak keys, soft or weak values, timed expiration, and on-demand computation of values
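
MapMaker itself needs the Google Collections library on the classpath, so it is not used here. As a standard-library-only analogue of just the "on-demand computation of values" feature, the sketch below uses ConcurrentHashMap.computeIfAbsent (Java 8 and later). It does not cover weak or soft references or timed expiration, and the class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class OnDemandMapSketch {
    private final ConcurrentMap<String, Integer> cache = new ConcurrentHashMap<>();
    private final AtomicInteger computations = new AtomicInteger();

    /** Returns the cached value, computing it only on the first request for a key. */
    int lengthOf(String key) {
        return cache.computeIfAbsent(key, k -> {
            computations.incrementAndGet(); // counts how often the "expensive" work runs
            return k.length();
        });
    }

    int computations() {
        return computations.get();
    }

    public static void main(String[] args) {
        OnDemandMapSketch sketch = new OnDemandMapSketch();
        sketch.lengthOf("modularity");
        sketch.lengthOf("modularity"); // served from the map, not recomputed
        System.out.println(sketch.computations()); // prints 1
    }
}
```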

IntelliJ IDEA Community Edition

IntelliJ IDEA Open Sourced | JetBrains IntelliJ IDEA Blog
Starting with the upcoming version 9.0, IntelliJ IDEA will be offered in two editions: Community Edition and Ultimate Edition. The Community Edition focuses on Java SE technologies, Groovy and Scala development. It’s free of charge and open-sourced under the Apache 2.0 license. The Ultimate edition with full Java EE technology stack remains our standard commercial offering. See the feature comparison matrix for the differences.

OnlineOCR Converts Your Scanned Documents to Editable Text - Conversion - Lifehacker

You can upload documents in a variety of formats like PDF, TIFF, JPG, and other image files as well as a ZIP of your document.

Selecting the Best Java Collection Class for Your Application

Selecting the Best Java Collection Class for Your Application - referenced from http://www.javaperformancetuning.com/news/newtips104.shtml
* With CopyOnWriteArrayList all operations that change the contents of a CopyOnWriteArrayList collection cause the underlying array to be replaced with a copy of itself before the contents of the array are changed. Any active iterators will continue to see the unmodified array, so there is no need for locks.
* HashSet is faster than TreeSet, so only use the TreeSet preferentially when you need elements to remain in sorted order.
* HashSet is faster than LinkedHashSet so only use the LinkedHashSet preferentially when you need elements to remain in insertion order.
* ConcurrentSkipListSet keeps its elements in sorted order, is thread-safe, and is usually preferable to a synchronized wrapped set.
* With CopyOnWriteArraySet all operations that change the contents of a CopyOnWriteArraySet collection cause the underlying array to be replaced with a copy of itself before the contents of the array are changed. Any active iterators will continue to see the unmodified array, so there is no need for locks.
* PriorityQueue add and remove (of the head) take time proportional to the logarithm of the number of objects in the queue; removing an arbitrary object takes time proportional to the number of objects. Queues based on the PriorityQueue class do not block. PriorityBlockingQueue is similar in performance but does block.
* For situations when multiple threads are waiting to perform operations on a queue, the behavior of an ArrayBlockingQueue object will be more consistent and predictable than for a LinkedBlockingQueue.
* ConcurrentLinkedQueue is thread-safe and usually preferable to a synchronized wrapped queue.
* DelayQueue is a specialized blocking queue that consults the objects it contains as to when they can be removed from the queue.
* A SynchronousQueue object is always empty; the put method blocks unless another thread is waiting for the object's take method to return.
* If you are not interested in how objects will be organized in a collection, then the only other consideration is performance. In that case, use the ArrayList class. It is fast and makes efficient use of memory.
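
A few of the behaviours in the list above can be checked directly with the standard library. This is a small sketch with hypothetical class and method names; it demonstrates the snapshot iterator of CopyOnWriteArrayList, the ordering difference between TreeSet and LinkedHashSet, and the always-empty SynchronousQueue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.SynchronousQueue;

class CollectionChoiceSketch {
    /** Returns what an iterator sees when it was created before a later write. */
    static List<String> snapshotSeenByIterator() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        Iterator<String> it = list.iterator(); // bound to the array as it is right now
        list.add("c");                         // the write goes to a fresh copy of the array
        List<String> seen = new ArrayList<>();
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    /** TreeSet: duplicates dropped, elements kept in sorted order. */
    static List<String> sorted(List<String> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }

    /** LinkedHashSet: duplicates dropped, first-insertion order kept. */
    static List<String> insertionOrder(List<String> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("pear", "apple", "fig", "apple");
        System.out.println(snapshotSeenByIterator());                 // prints [a, b]
        System.out.println(sorted(input));                            // prints [apple, fig, pear]
        System.out.println(insertionOrder(input));                    // prints [pear, apple, fig]
        System.out.println(new SynchronousQueue<String>().isEmpty()); // prints true
    }
}
```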

Continuous Performance Testing

Continuous Performance Testing - referenced from http://www.javaperformancetuning.com/news/newtips104.shtml
* Industry has learned the hard way that holding a big testfest at the end of the development phase of a project is a great way to ensure failure - this applies equally to performance testing.
* Avoiding premature optimizations has been confused by some as avoiding performance aspects of a project altogether, only focusing on it when it becomes necessary - this is wrong. You need to consider performance at all stages, or you are setting yourself up for failure or much more work later on.
* During the requirements gathering phase you need to collect users' expectations for performance.
* While defining the architecture, you should use benchmarking to determine that your decisions will not prevent your architecture from meeting your performance goals.
* During development you should integrate Continuous Performance Testing into your system so that you can gain an understanding of how your system is progressing.
* Performance testing at unit test level (i.e. method level) isn't completely wrong, but is a level of granularity that isn't advisable.
* Performance testing during development should be done at an appropriate level of granularity: at the component level, or possibly at the class level for some types of classes.
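
One way to picture a component-level performance check that runs with every build is a timing harness with an explicit budget. The sketch below is only an illustration under stated assumptions: the class, the sortComponent stand-in, and the 50 ms budget are all hypothetical, and there is no JIT warm-up, so a real harness would need more care (or a dedicated benchmarking tool).

```java
import java.util.Arrays;

class PerfBudgetSketch {
    /** Runs the task several times (runs must be at least 1) and returns the median time in nanoseconds. */
    static long medianNanos(Runnable task, int runs) {
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2]; // the median is less sensitive to one-off pauses than the mean
    }

    /** True when the measurement does not exceed the agreed budget. */
    static boolean withinBudget(long measuredNanos, long budgetNanos) {
        return measuredNanos <= budgetNanos;
    }

    /** Hypothetical component under test: sorts a reversed array of fixed size. */
    static void sortComponent() {
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = data.length - i;
        }
        Arrays.sort(data);
    }

    public static void main(String[] args) {
        long budgetNanos = 50_000_000L; // 50 ms, an arbitrary illustrative budget
        long median = medianNanos(PerfBudgetSketch::sortComponent, 21);
        System.out.println("median ns: " + median + ", within budget: " + withinBudget(median, budgetNanos));
    }
}
```

Tracking the median from build to build shows how the component is progressing, which is the point of testing continuously rather than at the end.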

Dilbert: Deadlines


Causes of Java PermGen Memory Leaks « Amped and Wired

8 Monkeys

8 Monkeys | Gleez
Put eight monkeys in a room. In the middle of the room is a ladder, leading to a bunch of bananas hanging from a hook on the ceiling.

Each time a monkey tries to climb the ladder, all the monkeys are sprayed with ice water, which makes them miserable.

Soon enough, whenever a monkey attempts to climb the ladder, all of the other monkeys, not wanting to be sprayed, set upon him and beat him up.

Soon, none of the eight monkeys ever attempts to climb the ladder.

One of the original monkeys is then removed, and a new monkey is put in the room. Seeing the bananas and the ladder, he wonders why none of the other monkeys are doing the obvious. But undaunted, he immediately begins to climb the ladder.
All the other monkeys fall upon him and beat him silly. He has no idea why.

However, he no longer attempts to climb the ladder. A second original monkey is removed and replaced. The newcomer again attempts to climb the ladder, but all the other monkeys hammer the crap out of him. This includes the previous new monkey, who, grateful that he's not on the receiving end this time, participates in the beating because all the other monkeys are doing it. However, he has no idea why he's attacking the new monkey.
One by one, all the original monkeys are replaced.

Eight new monkeys are now in the room. None of them have ever been sprayed by ice water. None of them attempt to climb the ladder. All of them will enthusiastically beat up any new monkey who tries, without having any idea why.

This is how any company's policies get established.

Interns and HR

Javva The Hutt July 2009 - Java The Hutt
I had an intern last week. No, not for my lunch, actually doing stuff for me. The company decided to fork out all of a whole unused seat and workstation, an actual security pass, and an actual system account for this intern to do a week of work experience. No pay for the poor blighter, not even expenses to cover transport costs. It was just one week, and I was amazed that he actually got onto the system; whenever I've seen new employees come in, it's usually taken the best part of a week just to get them set up. But our one-week intern was properly on the system on day one. Email and internet access and all he needed! I think we have a rogue monkey.

A rogue monkey? Well, in every big organisation I've ever been in, it takes a week to get people set up. It shouldn't; realistically, after the company sets up a few employees, someone should say, 'golly, these are all the things you need to do to set up a new employee, and look, 99% of it can be automated, so here, I've done that, now you just need these half a dozen pieces of information entered on this screen, and press the button and presto, it's all done'. But that's not company policy, so anyone that does that gets beaten up by the other monkeys. I am, of course, referring to the parable of the eight monkeys (if you've not heard of it, web search 'the eight monkeys', it's worth reading).

So I can only assume that given our intern was up and running on day one, nay not even day one, but actually immediately after Monday morning induction, that somehow a rogue monkey has slipped in and got the bananas. Congratulations unknown rogue monkey, I am sorry to say that your career here is likely to be short. And will involve being beaten up by the other monkeys when they find you got the bananas. That's life.

Anyway, knowing that I had the use of an untrained newbie for a week, I thought to myself 'what on earth can I give him to do to keep him out of my hair and busy'. After all, interns rarely provide anything useful even when they do the three-month internship, so what could he do in a week? So I looked through the very bottom of my retired to-do lists, you know, the ones you've given up even bothering to list since you are only going to get to them on your fifth life. And I gave him those to do. Comparative analysis of performance products, resolutions to simple issues that need no knowledge, just a bit of time spent on the internet, that sort of thing. And just in case he was fast (you never know), I gave him enough for two weeks' worth of work.

By the end of Wednesday he had finished the lot. And quality work too. He found out stuff that would make us more efficient, that would improve the quality of our work. I wanted to hire him, but HR did their HR job of saying no, we can't have efficiency. Maybe I should just start my own company. At least now I know who'd be employee number 2.

05 October 2009

Monitoring direct buffers : Alan Bateman

Discover and Promote What Works

The thrill of greenfield development is enticing. It's the mark of a young developer to want to invent (for example) their own web framework the moment they discover something they don't like in existing web frameworks.

It doesn't sound at first to be as exciting, but finding something that works and promoting that gives you the thrill of success (rather than just the thrill of invention). I don't know about you, but I'm finding success to be much more attractive than exploring for the sake of exploration. Indeed, I'd rather not have the glory of raw invention; I'll happily trade that in for success, and I'll happily credit the person who made that success happen. Stand on the shoulders of giants, and all that.

Seth Godin says this might be the most important concept in Tribes. His example is even more compelling: societal change. Rather than going into a village with some Westernized idea of how to solve a health problem, you "find the mom with the healthy kids ... then help others in the village notice what she was doing."

This is a subtle but compelling mental shift when seeking answers. One of the biggest aspects is that it isn't about coming up with some argument to convince people that something will work. There is no argument about whether it will work -- you can start doing something because you know it works, and later you can argue about what exactly is producing the success, presumably in an attempt to refine the process (or you can refine by discovering other things that work).

Note that this is a "natural selection" process. You allow the equivalent of genetic mutation to produce lots of solutions and choose the one(s) that work rather than assuming you can use logic and first principles to invent a solution (not that the latter approach is bad per se; rather we are over-fixated on it as the only way to solve problems).

26 September 2009

Two Kinds of Management by Bruce Eckel

Summary: Guidance & inspiration vs. directing & controlling.

Virtually all the management I've had to do involves people working independently, mostly because I can't bring myself to micromanage and control another person. The downside of this is that if someone isn't in touch with themselves, they can represent things the way they want them to be, rather than how they are. And when they are left to their own devices, they behave differently than what they represented.

I am then put in a position where I have to either (1) take over and start micromanaging/controlling or (2) give up whatever investment I have with that person and try to find someone else. Because of my nature, option 1 eventually fails and I end up with option 2; in recent times I've been trying to understand this well enough that I don't try to fix things and just realize that, for whatever reason, it hasn't worked out so I go as soon as possible (as soon as I can become aware) to option 2. This also follows the maxim "never consider sunk costs when making decisions."

I think this approach is also good for the person in question. As long as you are pretending that things are working OK, you are supporting that person in their illusions. What they actually need is help in moving on, so that they can discover whatever they are really good at.

If you have a company structure where managers are responsible for directing and controlling employee activities, those managers cannot do creative work; they are effectively lost to the needs of the corporate hierarchy. This seems like an unacceptable loss to me, both for the company and for the managers. (In Here Comes Everybody, Clay Shirky makes the case that electronic communication is disrupting the need for traditional corporate hierarchy).

In Outliers, Malcolm Gladwell shows how human-caused disasters never come from big, obvious events or mistakes, but are always an accumulation of small, apparently manageable errors and problems (usually about seven of these before everything goes pear-shaped). What I've been noticing lately is that relationship problems -- typically business relationships but I am starting to think also personal ones -- follow this pattern. People rarely do one big thing that makes it easy to see there's something wrong. In those cases, you don't usually have to wait for the big thing to happen, you see it coming long before (and it's often obvious enough that you don't hire the person in the first place). It's the small things that slip beneath your radar, where you say, "oh, that's just a little thing, I can let it slide" or "we can compensate for that" to the point where you have a bunch of small things causing problems, each of which you've kind of passed through and decided it wasn't a problem. The real failures, of course, are the ones that get past the Maginot Line of our attention.

I don't like command-and-control management, and I don't want to do it. To me it indicates a failure of team-building, and it wastes my time. The kind of management I want to do is the fun kind, where I can draw on my experience to help people do a better job, to inspire and guide. I've been fortunate that most of my consulting engagements have been like this. (Also note the similarity to Martin Fowler's discussion of software development attitudes).

What's your opinion of 'the Cloud'? the emperor's new clothes

Poll Result: 'the Cloud' Is Not an Earth-Shattering Development | Java.net
What's your opinion of 'the Cloud'?

* 13% (45 votes) - It's the future: gradually all apps will move to the Cloud.
* 6% (21 votes) - It's great. I'm using it already.
* 26% (91 votes) - It's an interesting development. I'll wait and see what comes of it.
* 19% (66 votes) - It's a passing phase, like so many other things we've seen.
* 30% (105 votes) - It's the emperor's new clothes: the Cloud is just a server, what's so new about it?
* 5% (19 votes) - I don't know; other

Continuous Deployment

Wow, here's an idea I'd never heard of or even contemplated; a bit scary:
The most controversial practice that Marty promotes is Continuous Deployment. This is the automated deployment of code to production. It includes automated testing and continuous integration, simple deployment/rollback scripts, a successful CI build triggers deployment, and there's real-time alerts in production. When shit goes wrong, you should use the "five whys" to perform root cause analysis. Marty admits that this is only a good idea when there's a high-level of trust in your development team and lots of tests to prove nothing is broken.

The benefits of continuous deployment are a lower story cycle time, less waste in deploying code, faster delivery of features and bug fixes, and finding integration issues quicker and in isolation. It's also a great way to promote not checking in shitty code.

The skeptics think this is a bad idea because 1) it's scary, 2) they believe it causes lower quality and 3) it causes more issues in production. The good news is you can still control production deployments with your source control system (e.g. branches and such). More than anything, it forces you to have a high quality continuous integration system that acts as the gatekeeper for what goes to production.

TCP tuning

24 September 2009

Project Shoal announces the final release of Shoal 1.1 FCS

Shoal 1.1, the latest release of the Java based dynamic clustering framework, contains a number of features and bug fixes that significantly improve clustering reliability, and scalability for your applications.

Kirk Pepperdine on Performance Tuning and Cloud Computing | Java.net

Early in the interview, Janice asked Kirk: "What misconceptions do you encounter among developers about performance tuning?"

Kirk's response wasn't what I would have expected. He answered:

Here's a major one: All developers believe they are good at performance tuning. They might be very good at writing performance code, and they might be very good at coding, but generally I find that most developers are not very good at performance tuning.

How can this be?

When developers are put in situations where they are asked to performance tune, they typically look at code and take some action. But they invariably forget the dynamics of the system. If you don't include the dynamics of the system when you performance tune or if you think you understand the dynamics of the system and guess wildly wrong - which is quite often the case - you end up doing the wrong thing, which frequently happens.

Kirk notes that software testers tend to be more successful when it comes to performance tuning than most developers, because they are more accustomed to working within a defined process.

IntelliJ IDEA and JRebel: Better Together | JetBrains IntelliJ IDEA Blog

When long is not long enough | Java.net

Some Java Concurrency Tips | Java.net

Benchmarking - thrift-protobuf-compare

20 September 2009

Tervela bags $18m for go faster messaging appliances • The Register

In June this year Tervela launched the third generation of its messaging appliance, the TMX-500 Message Switch. This uses custom ASIC and FPGA chips to speed up the processing of Java Message Service (JMS) messages as they bounce around n-tier Java applications, loosely gluing them together so they can do transactions like the ones at the heart of financial systems on display at HPC on Wall Street.

The TMX-500 comes in a 2U form factor (and is basically three motherboards and some fans) that can have sixteen Gigabit Ethernet links or four 10 Gigabit Ethernet links, allowing anywhere from 4 to 64 million JMS messages per second to be processed, all while keeping the message latency under 10 microseconds.

12 September 2009

Developing with real-time Java, Part 2: Improve service quality

Eliminate Architecture | Javalobby

Therefore, I’ll conclude that the goal of software architecture must be to eliminate the impact and cost of change, thereby eliminating architectural significance. And if we can do that, we have eliminated the architecture.

Creating Objects Without Calling Constructors

[JavaSpecialists 175] - Creating Objects Without Calling Constructors
import sun.reflect.ReflectionFactory;
import java.lang.reflect.Constructor;

public class SilentObjectCreator {
  // Creates an instance of clazz without running any of its constructors.
  public static <T> T create(Class<T> clazz) {
    return create(clazz, Object.class);
  }

  // Runs only the no-arg constructor of parent; constructors below it are skipped.
  public static <T> T create(Class<T> clazz, Class<? super T> parent) {
    try {
      ReflectionFactory rf = ReflectionFactory.getReflectionFactory();
      Constructor<?> objDef = parent.getDeclaredConstructor();
      Constructor<?> intConstr = rf.newConstructorForSerialization(clazz, objDef);
      return clazz.cast(intConstr.newInstance());
    } catch (RuntimeException e) {
      throw e;
    } catch (Exception e) {
      throw new IllegalStateException("Cannot create object", e);
    }
  }
}
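
To see the effect, here is a minimal sketch of the trick in use. The Noisy class and the createSilently helper are made up for illustration; only the ReflectionFactory call comes from the newsletter code. It assumes a JDK that still exposes sun.reflect.ReflectionFactory (on Java 9 and later that is the jdk.unsupported module), and the compiler will probably warn about using an internal API.

```java
import java.lang.reflect.Constructor;
import sun.reflect.ReflectionFactory;

// Hypothetical demo class, not part of the original newsletter.
class SilentCreationDemo {
    // Illustrative class whose constructor leaves a visible trace.
    static class Noisy {
        boolean constructed;              // stays false if no constructor body runs
        Noisy() { constructed = true; }
    }

    // Same technique as SilentObjectCreator.create(clazz, Object.class):
    // build a "serialization constructor" that only runs Object's constructor.
    static <T> T createSilently(Class<T> clazz) {
        try {
            ReflectionFactory rf = ReflectionFactory.getReflectionFactory();
            Constructor<Object> objectCtor = Object.class.getDeclaredConstructor();
            Constructor<?> silentCtor = rf.newConstructorForSerialization(clazz, objectCtor);
            return clazz.cast(silentCtor.newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create object", e);
        }
    }

    public static void main(String[] args) {
        Noisy normal = new Noisy();
        Noisy silent = createSilently(Noisy.class);
        System.out.println("new Noisy():      constructed = " + normal.constructed);
        System.out.println("createSilently(): constructed = " + silent.constructed);
    }
}
```

The instance made with new should report constructed = true, while the silently created one should still have the field at its default value, because the Noisy constructor body never ran.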

07 September 2009



Jon Masamitsu's Weblog

Our low-pause collector (UseConcMarkSweepGC) which we are usually careful to call our mostly concurrent collector has several phases, two of which are stop-the-world (STW) phases.

1. STW initial mark
2. Concurrent marking
3. Concurrent precleaning
4. STW remark
5. Concurrent sweeping
6. Concurrent reset

The first STW pause is used to find all the references to objects in the application (i.e., object references on thread stacks and in registers). After this first STW pause is the concurrent marking phase during which the application threads run while GC is doing additional marking to determine the liveness of objects. After the concurrent marking phase there is a concurrent preclean phase (described more below) and then the second STW pause which is called the remark phase. The remark phase is a catch-up phase in which the GC figures out all the changes that the application threads have made during the previous concurrent phases. The remark phase is the longer of these two pauses. It is also typically the longest of any of the STW pauses (including the minor collection pauses). Because it is typically the longest pause we like to use parallelism wherever we can in the remark phase.

Part of the work in the remark phase involves rescanning objects that have been changed by an application thread (i.e., looking at the object A to see if A has been changed by the application thread so that A now references another object B and B was not previously marked as live). This includes objects in the young generation and here we come to the point of these ramblings. Rescanning the young generation in parallel requires that we divide the young generation into chunks so that we can give chunks out to the parallel GC threads doing the rescanning. A chunk needs to begin on the start of an object and in general we don't have a fast way to find the starts of objects in the young generation.

Given an arbitrary location in the young generation we are likely in the middle of an object, don't know what kind of object it is, and don't know how far we are from the start of the object. We know that the first object in the young generation starts at the beginning of the young generation and so we could start at the beginning and walk from object to object to do the chunking but that would be expensive. Instead we piggy-back the chunking of the young generation on another concurrent phase, the precleaning phase.

During the concurrent marking phase the application threads are running and changing objects so that we don't have an exact picture of what's alive and what's not. We ultimately fix this up in the remark phase as described above (the object-A-gets-changed-to-point-to-object-B example). But we would like to do as much of the collection as we can concurrently so we have the concurrent precleaning phase. The precleaning phase does work similar to parts of the remark phase but does it concurrently. The details are not needed for this story so let me just say that there is a concurrent precleaning phase. During the latter part of the concurrent precleaning phase the young generation "top" (the next location to be allocated in the young generation and so at an object start) is sampled at likely intervals and is saved as the start of a chunk. "Likely intervals" just means that we want to create chunks that are not too small and not too large so as to get good load balancing during the parallel remark.

Ok, so here's the punch line for all this. When we're doing the precleaning we do the sampling of the young generation top for a fixed amount of time before starting the remark. That fixed amount of time is CMSMaxAbortablePrecleanTime and its default value is 5 seconds. The best situation is to have a minor collection happen during the sampling. When that happens the sampling is done over the entire region in the young generation from its start to its final top. If a minor collection is not done during that 5 seconds then the region below the first sample is 1 chunk and it might be the majority of the young generation. Such a chunking doesn't spread the work out evenly to the GC threads so reduces the effective parallelism.

If the time between your minor collections is greater than 5 seconds and you're using parallel remark with the low-pause collector (which you are by default), you might not be getting parallel remarking after all. A symptom of this problem is significant variations in your remark pauses. This is not the only cause of variation in remark pauses but take a look at the times between your minor collections and if they are, say, greater than 3-4 seconds, you might need to up CMSMaxAbortablePrecleanTime so that you get a minor collection during the sampling.

And finally, why not just have the remark phase wait for a minor collection so that we get effective chunking? Waiting is often a bad thing to do. While waiting the application is running and changing objects and allocating new objects. The former makes more work for the remark phase when it happens and the latter could cause an out-of-memory before the GC can finish the collection. There is an option CMSScavengeBeforeRemark which is off by default. If turned on, it will cause a minor collection to occur just before the remark. That's good because it will reduce the remark pause. That's bad because there is a minor collection pause followed immediately by the remark pause which looks like 1 big fat pause.
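
The check suggested above (compare the time between minor collections with CMSMaxAbortablePrecleanTime) can be sketched roughly as follows. This is only an illustration: the class name, the method names and the timestamps are invented, and it assumes you have already pulled the minor-collection timestamps (in seconds) out of your GC log by some other means.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the JVM or of any GC log tool.
class PrecleanGapCheck {
    // Default sampling window quoted in the post: 5 seconds.
    static final double DEFAULT_ABORTABLE_PRECLEAN_SECONDS = 5.0;

    /** Largest gap, in seconds, between consecutive minor-collection timestamps. */
    static double maxGapSeconds(List<Double> minorGcTimestamps) {
        double max = 0.0;
        for (int i = 1; i < minorGcTimestamps.size(); i++) {
            double gap = minorGcTimestamps.get(i) - minorGcTimestamps.get(i - 1);
            max = Math.max(max, gap);
        }
        return max;
    }

    /** True when some gap exceeds the window, so the sampling may not see a minor collection. */
    static boolean mayLoseParallelRemark(List<Double> minorGcTimestamps, double precleanSeconds) {
        return maxGapSeconds(minorGcTimestamps) > precleanSeconds;
    }

    public static void main(String[] args) {
        // Made-up timestamps for illustration only.
        List<Double> timestamps = Arrays.asList(1.0, 3.5, 10.0, 12.0);
        System.out.println("max gap between minor collections: " + maxGapSeconds(timestamps) + " s");
        if (mayLoseParallelRemark(timestamps, DEFAULT_ABORTABLE_PRECLEAN_SECONDS)) {
            System.out.println("consider raising CMSMaxAbortablePrecleanTime");
        }
    }
}
```

Passing the window in as a parameter makes it easy to try the more cautious 3-4 second threshold mentioned in the post instead of the full 5 seconds.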

Understanding Concurrent Mark Sweep Garbage Collector Logs

03 September 2009

Nuba - Free conference calling

Nuba, free conference calling:
* Up to 100 participants
* Audio Recording
* MP3 audio Download & Podcasts
* WEB & VIDEO Conferencing
* Available 24/7, RESERVATION-LESS
* International access
* No Invoicing, no billing, no hidden cost
* New! Outlook plugin

Passing parameters to JUnit tests

JUnit: A Little Beyond @Test, @Before, @After

Looks like JUnit 4 has a feature that I find incredibly useful in TestNG. It's the ability to invoke a single test method many times with different arguments. This is very useful for invoking the test method with input parameters and expected results.

multiple ssh private keys

In quite a few situations it's preferred to have ssh keys dedicated to a service or a specific role. E.g. a key to use for home / fun stuff, another one to use for Work things, and another one for Version Control access etc. Creating the keys is simple, just use

ssh-keygen -t rsa -f ~/.ssh/id_rsa.work

... ssh config lets you get down to a much finer level of control on keys and other per-connection setups ... ~/.ssh/config looks like this:

Host *.home.lan
IdentityFile ~/.ssh/id_dsa.home
User kbsingh

Host *.vpn
IdentityFile ~/.ssh/id_rsa.work
User karanbir
Port 44787

Host *.d0.karan.org
IdentityFile ~/.ssh/id_rsa.d0
User admin
Port 21871

Of course, if I am connecting to a remote host that does not match any of these selections, ssh will default back to checking for and using the 'usual' key, ~/.ssh/id_dsa or ~/.ssh/id_rsa

Staying Current: A Software Developer's Responsibility | Javalobby

Converting log files into csv

Here is how I converted a log file containing lines like

type:Trade, clientOrderId:20634240922, marketOrderId:bsqj0, marketEventId:ccpfv, clientEventId:ccpfv, status:PARTIAL_FILL, orderParameters:[market:RPR, party:MYCO, instrument:USDCAD, side:BUY, quantity:1000000.0, minimumQuantity:10000.0, maximumShowQuantity:1000000.0, price:1.0975700000000002, type:LIMIT, quoteType:INDICATIVE, timeInForce:GTC, properties:null], transactDateTime:2009-08-27T00:00:15.712Z, receivedDateTime:2009-08-27T00:00:15.712Z, openQuantity:980000.0, executedQuantity:20000.0, averageExecutedPrice:1.0975700000000002, marketTradeId:c4v, clientTradeId:null, tradeQuantity:20000.0, tradePrice:1.0975700000000002, counterparty:SOMECO, settlementDateTime:2009-08-28T00:00:00.000Z, taker:false, comments:null

into a csv file that looked more like:

c4v, 20634240922, 2009-08-27T00:00:15.712Z, MYCO, USDCAD, BUY, PARTIAL_FILL, 20000.0, 1.0975700000000002,

tail -F mylogfile.log | sed 's/.*, clientOrderId:\(\S*\),.*, status:\(\S*\),.*, party:\(\S*\), instrument:\(\S*\), side:\(\S*\),.*, transactDateTime:\(\S*\),.*, marketTradeId:\(\S*\),.*, tradeQuantity:\(\S*\), tradePrice:\(\S*\).*/\7, \1, \6, \3, \4, \5, \2, \8, \9/'
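
For anyone who would rather do this inside a Java process than with sed, here is a rough equivalent. It is a sketch, not a drop-in replacement: the class and method names are invented, the sample line in main is a shortened version of the one above, and instead of one large expression it looks up each key separately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that mirrors the column order of the sed command.
class TradeLogToCsv {
    // Same order as the sed replacement: \7, \1, \6, \3, \4, \5, \2, \8, \9
    static final String[] KEYS = {
        "marketTradeId", "clientOrderId", "transactDateTime", "party",
        "instrument", "side", "status", "tradeQuantity", "tradePrice"
    };

    /** Returns the text after "key:" up to the next ',' or ']', or "" if the key is absent. */
    static String field(String line, String key) {
        // The lookbehind stops "party" from matching inside "counterparty".
        Pattern p = Pattern.compile("(?<![A-Za-z])" + Pattern.quote(key) + ":([^,\\]]*)");
        Matcher m = p.matcher(line);
        return m.find() ? m.group(1) : "";
    }

    /** Builds one CSV row from one log line. */
    static String toCsv(String line) {
        List<String> values = new ArrayList<>();
        for (String key : KEYS) {
            values.add(field(line, key));
        }
        return String.join(", ", values);
    }

    public static void main(String[] args) {
        // Shortened sample line, for illustration only.
        String line = "type:Trade, clientOrderId:20634240922, status:PARTIAL_FILL, "
            + "orderParameters:[market:RPR, party:MYCO, instrument:USDCAD, side:BUY], "
            + "transactDateTime:2009-08-27T00:00:15.712Z, marketTradeId:c4v, "
            + "tradeQuantity:20000.0, tradePrice:1.0975700000000002, counterparty:SOMECO";
        System.out.println(toCsv(line));
    }
}
```

Looking up keys one at a time means the output does not depend on the order of fields in the log line, at the cost of scanning the line once per column.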

16 August 2009

Backing up files to a remote site

Recently I changed ISP and lost access to an FTP server that I used to upload critical files from my server every night via an automated job. I have now replaced this with some free space at box.net that I can access via WebDAV. I use davfs2 to mount this space as a normal directory on my Linux machine and can then simply copy files to it. The setup process on a CentOS machine is:

yum install fuse dkms-fuse davfs2
modprobe fuse
edit /etc/davfs2/secrets and insert "http://www.box.net/dav boxnetusername boxnetpassword"
edit /etc/davfs2/davfs2.conf and insert "use_locks 0"
mkdir /media/box.net
mount -t davfs http://www.box.net/dav /media/box.net

09 August 2009

Remote Mirroring Using nc and dd | Linux Journal

You can use the dd and nc commands for exact disk
mirroring from one server to another.
The following commands send data from Server1 to Server2:

Server2# nc -l 12345 | dd of=/dev/sdb
Server1# dd if=/dev/sda | nc server2 12345

Make sure that you issue Server2's command first
so that it's listening on port 12345 when Server1 starts sending its data.

Unless you're sure that the disk is not being modified,
it's better to boot Server1 from a RescueCD or LiveCD to do the copy.

Spotify Is the Best Desktop Music Player We've Ever Used - Lifehacker

Modularity Patterns

Modularity Patterns does a great job of describing the tension between reuse and ease-of-use.

06 August 2009

Maven Versions

Versions is a maven plugin that offers useful functions, such as listing newer versions of all your dependencies.
  • versions:display-dependency-updates scans a project's dependencies and produces a report of those dependencies which have newer versions available.
  • versions:display-plugin-updates scans a project's plugins and produces a report of those plugins which have newer versions available.
  • versions:update-parent updates the parent section of a project so that it references the newest available version. For example, if you use a corporate root POM, this goal can be helpful if you need to ensure you are using the latest version of the corporate root POM.
  • versions:update-properties updates properties defined in a project so that they correspond to the latest available version of specific dependencies. This can be useful if a suite of dependencies must all be locked to one version.
  • versions:update-child-modules updates the parent section of the child modules of a project so the version matches the version of the current project. For example, if you have an aggregator pom that is also the parent for the projects that it aggregates and the children and parent versions get out of sync, this mojo can help fix the versions of the child modules. (Note you may need to invoke Maven with the -N option in order to run this goal if your project is broken so badly that it cannot build because of the version mis-match).
  • versions:lock-snapshots searches the pom for all -SNAPSHOT versions and replaces them with the current timestamp version of that -SNAPSHOT, e.g. -20090327.172306-4
  • versions:unlock-snapshots searches the pom for all timestamp locked snapshot versions and replaces them with -SNAPSHOT.
  • versions:resolve-ranges finds dependencies using version ranges and resolves the range to the specific version being used.
  • versions:use-releases searches the pom for all -SNAPSHOT versions which have been released and replaces them with the corresponding release version.
  • versions:use-next-releases searches the pom for all non-SNAPSHOT versions for which there has been a newer release and replaces them with the next release version.
  • versions:use-latest-releases searches the pom for all non-SNAPSHOT versions for which there has been a newer release and replaces them with the latest release version.
  • versions:use-next-versions searches the pom for all versions for which there has been a newer version and replaces them with the next version.
  • versions:use-latest-versions searches the pom for all versions for which there has been a newer version and replaces them with the latest version.
  • versions:commit removes the pom.xml.versionsBackup files. Forms one half of the built-in "Poor Man's SCM".
  • versions:revert restores the pom.xml files from the pom.xml.versionsBackup files. Forms one half of the built-in "Poor Man's SCM".

05 August 2009

Guide - GUI Development Environment

GUIDE looks quite impressive in its screencast. I've never used a GUI builder, well not since Visual Basic, many years ago, so I can't really compare. However, GUIDE has been built by friends of mine, so I'm sure it's good!

"We're looking for developers to take GUIDE out for a spin and help us shape it up for general release. You can use GUIDE unregistered with some feature restrictions, or fill in your email address and we'll send you a licence valid for 30 days. You can request a new licence as often you need. If you have any problems with this form please contact us at betasignup@mindsilver.com."

03 August 2009


Overview of MultithreadedTC - Dept. of Computer Science, UMD
MultithreadedTC is different. It supports test cases that exercise a specific interleaving of threads. This is motivated by the principle that concurrent applications should be built using small concurrent abstractions such as bounded buffers, semaphores and latches. Separating the concurrency logic from the rest of the application logic in this way makes it easier to understand and test concurrent applications. Since these abstractions are small, it should be possible to deterministically test every (significant) interleaving in separate tests.
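
To give a flavour of what "exercising a specific interleaving" means, here is a small sketch using only java.util.concurrent. It does not use MultithreadedTC's API at all; the class name is invented, and polling the thread state is a crude stand-in for the clock that the framework provides.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical example; plain JDK classes, not MultithreadedTC.
class BoundedBufferInterleaving {
    /**
     * Forces one interleaving on a bounded buffer of capacity 1:
     * the buffer is full, the producer blocks in put(), then a take()
     * by the caller frees a slot and lets the producer finish.
     * Returns the two values in the order they were taken.
     */
    static int[] run() {
        try {
            BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(1);
            buffer.put(1);                          // buffer is now full

            Thread producer = new Thread(() -> {
                try {
                    buffer.put(2);                  // has to wait for a free slot
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            producer.start();

            // Wait until the producer is actually parked inside put().
            while (producer.getState() != Thread.State.WAITING) {
                Thread.sleep(1);
            }

            int first = buffer.take();              // unblocks the producer
            producer.join();
            int second = buffer.take();
            return new int[] {first, second};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running the interleaving", e);
        }
    }

    public static void main(String[] args) {
        int[] order = run();
        System.out.println("taken in order: " + order[0] + ", " + order[1]);
    }
}
```

Because the abstraction under test is small, the one interesting interleaving (put on a full buffer followed by take) can be pinned down in a single test.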

Performance Comparison of Apache MINA and JBoss Netty

JBoss.org: Netty - Performance Comparison of Apache MINA and JBoss Netty
"One of my big complaints with Apache MINA is the high latency that’s incurred when sending data. MINA uses a set of I/O threads to handle reads and writes. This is typical of many non-blocking I/O frameworks.

Netty is much more clever than MINA. In Netty, when you make a call to send data and the send queue is empty, Netty will just send the data. We are using non-blocking I/O so the call will still be asynchronous. If the send queue is not empty, Netty will queue up the data to be sent in much the same way that MINA does sends. The Netty approach is much faster."

Obba: A Java Object Handler for Excel and OpenOffice.


Obba provides a bridge from spreadsheets to Java classes. With Obba, you can easily build spreadsheet GUIs (Excel or OpenOffice) to Java code. Its main features are:

  • Loading of arbitrary jar or class files at runtime through a spreadsheet function.
  • Instantiation of Java objects, storing the object reference under a given object label.
  • Invocation of methods on objects referenced by their object handle, storing the handle to the result under a given object label.
  • Asynchronous method invocation and tools for synchronization, turning your spreadsheet into a multi-threaded calculation tool.
  • Allows arbitrary number of arguments for constructors or methods (avoids the limitation of the number of arguments for Excel worksheet functions).
  • Serialization and de-serialization (save Serializable objects to a file, restore them any time later).
  • All this through spreadsheet functions, without any additional line of code (no VBA needed, no additional Java code needed).

Fabrizio Giudici's Blog: Switched to Mercurial

"asynchronous mode support. This might not be the most important thing, but I love the fact that Mercurial commits locally, so you can work without being connected. You should know that I love to be disconnected, if possible, and other times I can't be connected if I'm traveling (and with the incoming August I'm going to move for a month to the countryside). The requirement of synchronous operations by a centralized SCM such as Subversion is really annoying if you like the "commit often" approach, like me: if you are not connected, either you can't work, or you are forced to change commit style, which is no good. If you have a mobile connection, which is often slow and intermittent, "commit often" works but you lose lots of time while waiting for commits to be completed. With Mercurial, commits are immediate, so you can work as usual. Then, a few times per day, you do a "push" to synchronize the primary repository - this might take some time, but you can run it while you're doing something else (another job, you're eating or having a bath)."

Dilbert Budget Planning

Making the Good Programmer ... Better | Javalobby

Got to buy some of these Belkin Gigabit Powerline HD plugs

13 July 2009

Export your contacts from your Google Apps account automatically as a vcard file

10/10/2009: edited as Google now requires a field called GALX to be posted too.
08/01/2011: edited as Google now requires a field called tok to be posted too.
16/05/2012: this solution no longer works; it looks like Google have modified their authentication mechanism.

To export your contacts from your Google Apps account automatically as a vcard file:

google apps domain: acme.com
user: doej
password: mypassword

wget "https://www.google.com/a/acme.com/ServiceLogin" --no-check-certificate --save-cookies="/tmp/gcookie" --keep-session-cookies --output-document=/dev/null

GALX=`grep "GALX" /tmp/gcookie | cut -f 7`

wget "https://www.google.com/a/acme.com/LoginAction2" --post-data="service=contacts&Email=doej&Passwd=mypassword&GALX=${GALX}&PersistentCookie=yes&continue=http://www.google.com/contacts/a/acme.com" --no-check-certificate --load-cookies="/tmp/gcookie" --save-cookies="/tmp/gcookie" --keep-session-cookies --output-document=/dev/null

wget "https://www.google.com/contacts/a/c/acme.com/ui/ContactManager?titleBar=true&hl=en&dc=true#" --no-check-certificate --load-cookies="/tmp/gcookie" --save-cookies="/tmp/gcookie" --keep-session-cookies --output-document=/tmp/gtoken

TOKEN=`grep "var tok" /tmp/gtoken | cut -c 16-72`

wget "http://www.google.com/contacts/a/c/acme.com/data/export?exportType=GROUP&groupToExport=%5EMine&out=VCARD" --no-check-certificate --load-cookies="/tmp/gcookie" --output-document="/tmp/addressbook.vcf"

Note: In the first wget, by continuing to mail.google.com and specifying keep-session-cookies, the cookie file, /tmp/gcookie, will contain entries for the mail.google.com. This is required for the second wget.

12 July 2009

Alex Miller - Hibernate query cache considered harmful?

Using Netcat to test ports

Yesterday I needed to quickly test if two machines could communicate over tcp/ip across a firewall. You can use netcat to quickly open a service that is listening on a port.

To listen on port 8888 on host server1:

server1$ nc -l 8888

On the other end (host server2), connect to server1 with the following:

server2$ nc server1 8888

Anything you type on server1 or server2 is now transferred to the other host and you have successfully tested connectivity.
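
If netcat is not installed on the connecting side, the same reachability test can be sketched in Java. This is only an illustration: the class name, the default host and port (taken from the example above) and the 3 second timeout are all assumptions.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: the client half of the netcat test.
class PortCheck {
    /** Returns true if a TCP connection to host:port is accepted within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, unknown host, blocked by a firewall, ...
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "server1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8888;
        boolean ok = canConnect(host, port, 3000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is not reachable"));
    }
}
```

Something still has to be listening on the other end, for example nc -l 8888 on server1 as shown above.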


Reuse: Is the Dream Dead? | Javalobby

An interesting analysis; the conclusion is copied below:
"The challenge we run into when attempting to create a highly reusable component is to manage the tension between reusability and usability. In our example above, breaking out the coarse-grained component into fine-grained components makes it more difficult to use each of the resulting fine-grained components. Likewise, creating lightweight components makes using the component more difficult since the component must be configured each time the component is used.

Fine-grained components have more component dependencies and lightweight components have more context dependencies. Each makes a component more reusable, but also more difficult to use. The key is to strike a balance, and that is a topic for another day not too far away."

08 July 2009

How Hotspot Decides to Clear SoftReferences

How Hotspot Decides to Clear SoftReferences, interesting quotes too:
...IMHO using Softreferences in an application server is evil, because of this very coarse grained way to configure their behaviour. Note also that the mechanism implies that Softreferences will (in low memory situations) only be reclaimed after more than one full GC.
...This is precisely the problem that my colleagues encountered. They really want an LRU cache for what they are trying to do, and they tried to achieve it cheaply using SoftReferences.
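
For comparison, a real LRU cache with an explicit size limit is not much code either. The sketch below uses LinkedHashMap in access order; the class name and the capacity are illustrative, and it is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache sketch built on LinkedHashMap's access-order mode.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true);        // true = iterate in access order, oldest first
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");                // "a" becomes the most recently used entry
        cache.put("c", 3);             // evicts "b", the least recently used entry
        System.out.println(cache.keySet());
    }
}
```

Unlike SoftReferences, eviction here is driven by an entry count you choose rather than by when the garbage collector decides to clear references. For concurrent use it would need external synchronization, for example Collections.synchronizedMap.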

06 July 2009

James Strachan gives Scala the thumbs up

James Strachan's Blog: Scala as the long term replacement for java/javac?
I can honestly say if someone had shown me the Programming Scala book by Martin Odersky, Lex Spoon & Bill Venners back in 2003 I'd probably have never created Groovy.

Database Shards

Shard (database architecture) - Wikipedia, the free encyclopedia
Horizontal partitioning is a design principle whereby rows of a database table are held separately, rather than splitting by columns (as for normalization). Each partition forms part of a shard, which may in turn be located on a separate database server or physical location. The advantage is the number of rows in each table is reduced (this reduces index size, thus improves search performance). If the sharding is based on some real-world aspect of the data (e.g. European customers vs. American customers) then it may be possible to infer the appropriate shard membership easily and automatically, and query only the relevant shard.

Sharding is in practice far more difficult than this. Although it has been done for a long time by hand-coding (especially where rows have an obvious grouping, as per the example above), this is often inflexible. There is a desire to support sharding automatically, both in terms of adding code support for it, and for identifying candidates to be sharded separately.

Where distributed computing is used to separate load between multiple servers (either for performance or reliability reasons) a shard approach may also be useful here.
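
The hand-coded variant mentioned above, where rows have an obvious grouping such as customer region, might look roughly like the sketch below. Everything in it is hypothetical: the class name ShardRouter, the region codes and the JDBC URLs are placeholders of mine, not taken from the Wikipedia article.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical router: maps a real-world attribute (customer region) to a shard.
class ShardRouter {
    private final Map<String, String> shardByRegion = new HashMap<>();
    private final String defaultShard;

    ShardRouter(String defaultShard) {
        this.defaultShard = defaultShard;
    }

    /** Registers which shard holds the rows for a region. */
    void assign(String region, String shardUrl) {
        shardByRegion.put(region, shardUrl);
    }

    /** Returns the shard to query for a region, falling back to the default shard. */
    String shardFor(String region) {
        return shardByRegion.getOrDefault(region, defaultShard);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter("jdbc:example://db-default/customers");
        router.assign("EU", "jdbc:example://db-eu/customers"); // placeholder URLs
        router.assign("US", "jdbc:example://db-us/customers");
        System.out.println(router.shardFor("EU"));
    }
}
```

This also shows the inflexibility the article complains about: moving a region to another server, or splitting one region in two, means touching the routing table and migrating rows by hand.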

Lifehacker - Clean Your Keyboard with a Hair Dryer

04 July 2009

Caught Between Two IDEs

In "Caught Between Two IDEs" the author highlights some problems that I share. IntelliJ IDEA has great features and has been responsible for high levels of productivity in software development. However, on a project that contains a large amount of source code it is incredibly slow and prone to being unresponsive for long periods. I've tried optimising its memory and GC settings, but that hasn't helped much. Maybe it is time to give Eclipse a try... I'll hold out a little bit longer but am nearly there.

01 July 2009

Lifehacker - Best of the Best: Hive Five Winners, March through June 2009

StackProbe - a new Java profiler

StackProbe - a new Java profiler
StackProbe can connect to local or remote applications through JMX without causing too much overhead, while giving accurate results. It periodically takes snapshots of thread stack traces and analyzes them in real time. The main advantages of this technique are:

* You can control overhead by dynamically adjusting the sampling rate.
* There is no risk of breaking the production system - no code is ever modified.
* The slight slowdown caused by sampling is distributed evenly between all methods in the code, so the relative values reported by the profiler are very accurate.
* StackProbe is able to estimate the statistical error of reported results. This is actually the first profiler to do this.
* The profiler is ready to work just after establishing a JMX connection.
* No special agents are required to be installed on the remote side.
* You can observe first results very early, just when the first few samples arrive.
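
The sampling idea itself can be tried with the standard java.lang.management API. The sketch below is my own illustration and says nothing about how StackProbe is actually implemented: the class name StackSampler is made up, it samples only the local JVM, and it simply counts which method is on top of each RUNNABLE thread's stack.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sampler, not StackProbe code: counts top-of-stack methods over time.
class StackSampler {

    /** Takes `samples` thread dumps, `intervalMillis` apart, and counts top frames. */
    static Map<String, Integer> sample(int samples, long intervalMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            // false, false: skip lock details to keep each snapshot cheap.
            for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
                StackTraceElement[] stack = info.getStackTrace();
                if (info.getThreadState() != Thread.State.RUNNABLE || stack.length == 0) {
                    continue; // ignore waiting or blocked threads
                }
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis); // the sampling rate controls the overhead
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        sample(20, 50).forEach((method, count) -> System.out.println(count + "  " + method));
    }
}
```

For a remote application the same ThreadMXBean interface can be obtained over a JMX connection, for example via ManagementFactory.newPlatformMXBeanProxy, which fits the claim that no code in the target needs to be modified.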

StackProbe is free for open-source projects. More information and an online demo can be found at http://www.stackprobe.com/.

30 June 2009

Painting the Bike Shed

Painting the Bike Shed - ACM Queue
What you witnessed was, unfortunately, a typical reaction to simple changes in a codebase. Have you ever noticed that when someone checks in some complex and, oftentimes, horrific piece of code, the check-in is greeted with an almost deafening silence? That is often because the people who should be reviewing such code just don’t have the time. Unfortunately, lots of people do have the time to review a 10- or 50-line change, and since they feel guilty for not having checked the bigger pieces of code, they nitpick the small pieces.

The explanation for why this occurs was first given by C. Northcote Parkinson, who wrote a book about management (Parkinson’s Law and Other Studies in Administration, Ballantine Books, 1969). He stated that if you were building something complex, then few people would argue with you because few people could understand what you were doing. If you were building something simple—say, a bike shed—which most anyone could build, then everyone would have an opinion. Unfortunately, as you learned, it’s not enough just to have an opinion, but most people feel they should express that opinion.

The engineer who wrote the original code clearly figured out that he was trapped in a pointless loop and decided to break out of it by telling people to paint the shed any color they liked. He or she was probably pretty sure that no one would actually change the color of the shed, but by not engaging in the pointless loop was able to let people get back to ignoring more important parts of the code, which is what they had been doing previously.

Determining What's Been Changed on RPM Based Systems

28 June 2009

Lifehacker's Firefox Add-On Packs

mime-util (Mime Detection Utility)

mime-util (Mime Detection Utility)
mime-util is a free, lightweight, Open Source Mime Detection utility for Java.

Mime Detection has come a long way from its beginnings as used for email content. The mime-util library is able to detect MIME Types from Files, InputStreams, URLs and byte arrays using pluggable MimeDetectors utilising various extensible and customisable detection policies such as extension mapping, glob file name matching, magic data matching and other content sniffing techniques.
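
To make two of those policies concrete, here is a small plain-Java sketch of extension mapping and magic data matching. It deliberately does not use the mime-util API: the class SimpleMimeDetector, its method names and the three supported types are assumptions of mine, chosen only to illustrate the idea.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical detector, not the mime-util API: content sniffing first, extension second.
class SimpleMimeDetector {
    static final String UNKNOWN = "application/octet-stream";

    /** Extension mapping: cheap, but trusts the file name. */
    static String byExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "png": return "image/png";
            case "gif": return "image/gif";
            case "pdf": return "application/pdf";
            default:    return UNKNOWN;
        }
    }

    /** Magic data matching: compares the leading bytes against known signatures. */
    static String byMagic(byte[] data) {
        if (startsWith(data, new byte[] {(byte) 0x89, 'P', 'N', 'G'})) return "image/png";
        if (startsWith(data, "GIF8".getBytes(StandardCharsets.US_ASCII))) return "image/gif";
        if (startsWith(data, "%PDF".getBytes(StandardCharsets.US_ASCII))) return "application/pdf";
        return UNKNOWN;
    }

    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    /** Prefers what the content says and falls back to the file name. */
    static String detect(String fileName, byte[] data) {
        String fromContent = byMagic(data);
        return UNKNOWN.equals(fromContent) ? byExtension(fileName) : fromContent;
    }
}
```

Checking the content before the name is a design choice: a PDF saved as report.txt is still reported as application/pdf, while an empty or unrecognised file falls back to whatever the extension suggests.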