01 October 2008

Smooks looks like a useful java library for processing data files

Milyn - Smooks
Smooks is a Java Framework/Engine for processing XML and non XML data (CSV, EDI, Java etc).

Smooks can be used to:

* Perform a wide range of Data Transforms - XML to XML, CSV to XML, EDI to XML, XML to EDI, XML to CSV, Java to XML, Java to EDI, Java to CSV, Java to Java, XML to Java, EDI to Java etc.
* Populate a Java Object Model from a data source (CSV, EDI, XML, Java etc). Populated object models can be used as a transformation result itself, or can be used by (e.g.) Templating resources for generating XML or other character based results. Also supports Virtual Object Models (Maps and Lists of typed data), which can be used by EL and Templating functionality.
* Process huge messages (GBs) - Split, Transform and Route message fragments to JMS, File, Database etc destinations.
* Enrich a message with data from a Database, or other Datasources.
* Perform Extract Transform Load (ETL) operations by leveraging Smooks' Transformation, Routing and Persistence functionality.


Smooks supports both DOM and SAX processing models, but adds a more "code friendly" layer on top of them. It allows you to plug in your own "ContentHandler" implementations (written in Java or Groovy), or reuse the many existing handlers.