Category: "ETL/EAI/ESB"

Reading Pentaho Kettle Solutions

On a rainy day, there's nothing better than to be sitting by the stove, stirring a big kettle with a finely turned spoon. I might be cooking up a nice meal of Abruzzo Maccheroni alla Chitarra con Polpettine, but actually, I'm reading the ebook edition of Pentaho Kettle Solutions: Building Open Source ETL Solutions with Pentaho Data Integration on my iPhone.

Some of my notes made while reading Pentaho Kettle Solutions:

…45% of all ETL is still done by hand-coded programs/scripts… made sense when… tools have 6-figure price tags… Actually, some extractions and many transformations can't be done natively in high-priced tools like Informatica and Ab Initio.

Jobs, transformations, steps and hops are the basic building blocks of KETTLE processes
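
To make those building blocks concrete, here is a toy sketch in Python. It is not Kettle's API (Kettle is Java, and the book builds transformations in its own tooling); every class and function name below is invented for illustration, and the sketch assumes a single linear chain of hops, whereas real transformations can branch and stream rows between steps in parallel.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]


@dataclass
class Step:
    """One unit of row-level work (filter, calculate, write, and so on)."""
    name: str
    process: Callable[[Iterable[Row]], Iterator[Row]]


@dataclass
class Transformation:
    """Steps joined by hops; a hop carries rows from one step to the next."""
    steps: Dict[str, Step] = field(default_factory=dict)
    hops: List[Tuple[str, str]] = field(default_factory=list)

    def add_step(self, step: Step) -> None:
        self.steps[step.name] = step

    def add_hop(self, source: str, target: str) -> None:
        self.hops.append((source, target))

    def run(self, start: str, rows: Iterable[Row]) -> List[Row]:
        # Toy simplification: follow one linear chain of hops from `start`.
        next_step = dict(self.hops)
        current = start
        stream: Iterable[Row] = rows
        while current is not None:
            stream = self.steps[current].process(stream)
            current = next_step.get(current)
        return list(stream)


def keep_large_orders(rows):
    """Hypothetical filter step: pass only rows with amount >= 100."""
    return (r for r in rows if r["amount"] >= 100)


def add_tax(rows):
    """Hypothetical calculation step: add a 10% tax-inclusive total."""
    for r in rows:
        yield {**r, "total": round(r["amount"] * 1.1, 2)}


def build_demo() -> Transformation:
    t = Transformation()
    t.add_step(Step("filter", keep_large_orders))
    t.add_step(Step("tax", add_tax))
    t.add_hop("filter", "tax")
    return t


def run_job(entries: List[Callable[[], object]]) -> List[object]:
    """A job, in this toy model, runs its entries one after another."""
    return [entry() for entry in entries]
```

Calling `build_demo().run("filter", rows)` pushes the rows through the filter step and then, via the hop, through the tax step; `run_job` stands in for the job level, which sequences whole transformations rather than rows.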

It's great to see the Agile Manifesto quoted at the beginning of the discussion of AgileBI.

MuleCon2008 Closing Campground

The community has been brought together for the final campground session. Rather than sing kumbaya, we've been teased with t-shirts, toys and books from O'Reilly and are being treated to a demo of a hot deployment by Travis Carlson. 'Tis all drag-and-drop goodness, with dragging jar files (bundles) as connectors and apps from a Mac OS X Finder window into Mule and watching the results in Safari. Ohhh! Ahhh! Safari shows Mule saying "Hello Travis, welcome to MuleCon" B) OK, I'll admit that this is pretty neat and should satisfy the most addicted user of Tibco wizards. This won't appear in Mule Enterprise Edition until after the 2008Q3 release of Mule EE 2.0.

One little aside… I was talking to an ex-TIBCO employee who said that as old as TIBCO is (it grew out of a Teknekron business that was founded in 1985 and became an independent business in 1997), the user event only draws about 400 people. This is only the second MuleCon (there was a sort-of pre-MuleCon in 2006 at a bar in London, but let's not count it), which last year drew around 100 folk and this year brought in around 240. That's phenomenal growth and shows the excitement that good software can bring to its users.

Questions for the Campground from Day 1 and some new ones.

  1. AMQP is an emerging, platform-independent messaging protocol, and it is part of the answer to the question about dot-Net integration with Mule.
  2. BPEL & BPM? Workflow tools, such as JBPM, that don't depend on simple web services, work better with Mule. Travis has done some work with this including the JBPM plugin that is in the current distribution of Mule, and there is more on the Forge. Mule also works well with Oracle and Iona's solutions. Mule Services and Mule Events are complementary.
  3. Mule is moving to Spring2, and has MuleSource thought about the exciting things that can be done with AOP in Spring2? Not so much yet, but it is among the next things to look at. Mule Galaxy might be one of the first areas where MuleSource would leverage AOP to create runtime governance.
  4. Can Mule Galaxy be used to manage other artifacts? The short answer is yes.
  5. How are folk clustering Mule? The typical way is to leverage an existing clustered app server. They are seeing more cases where the ESB itself holds state, and they're looking at things like Terracotta for such use cases.
  6. Discussions of rules engines and routing, but much work to be done, hopefully through the MuleForge.

Dave gave some closing remarks and that's all 'til next year.

MuleCon2008 Users Day 2

This first user track session is a panel, once again moderated by Michael Coté of Redmonk, with John Davies, Technical Director and Head of Research at Iona, Eugene Ciurana of Leapfrog, John Rowell, CTO of OpSource, and John Gardner, Principal Consultant at MomentumSI. My takeaway from this is that Mule, perhaps open source in general, allows a company to better balance its risk as it tests out the potential rewards while implementing tactically against strategic goals. A great metric from Eugene is that the longest wait he has ever had with Mule for a support issue is 90 minutes from opening a ticket to getting some sort of answer, and this is what keeps selling him on Mule vs proprietary solutions. Cost is a factor in ROI, and Mule wins out not just on initial costs but also on the costs of training, on-going support, consulting, and maintenance. Beyond this, the flexibility of Mule, the ability to integrate it with the infrastructure, and the openness not just of the code but of the company and personnel also add to the return, as does the "cool" factor.

Having now seen Coté moderate two panels, I have to say that he's one of the better moderators I've seen.

The second session is "A Quantum Leap in Ease of Use: Introducing Mule IDE 2.0". First let me say that having a nice, GUI IDE will help with the ROI perception, in that it is often the perception that development can be quicker and more efficient if done in a nice, pretty GUI. I happen to disagree, but I'm old and like CLI and typing more than clicking. Let's see what these guys say. Yep, yep, yep… they're showing a pretty drag-and-drop GUI configuration editor. I took the one-day training at last year's MuleCon, and found configuration through editing XML files to be very easy, and to help with understanding what Mule was doing. There was a contingent of attendees who complained that they didn't have their Tibco Wizard. This should help placate that contingent. Ah, the Mule IDE 2 is pre-alpha, though you can get it on the Forge. The release is expected to be in 2008Q3. The demonstration of the Mule IDE 2.0 plugin for Eclipse looks good; in addition to the GUI, one can still see the underlying XML.

John Davies, Technical Director and Head of Research of Iona, spoke on Financial Messaging in Mule. The concept of losing a message in an investment bank is literally unthinkable. One interesting tidbit is that MS Excel is the only MS product one finds in an investment bank. John went over the front/middle/back office requirements, and the place where Mule is found is in the middle, where XML is heavily used, usually over MQ & JMS. Using Artix, Mule, and GigaSpaces on an Azul box provides essentially linear scaling, and throughput is so high that the need for transactions and supporting transactional relational databases is eliminated; thus increased performance allows increasing performance even more.

Mule Galaxy MuleHQ and Mule Saturn

The final session for today goes into System Management of Mule and the additional products that mitigate risk in SOA deployments. This was definitely the most heavily attended of the user sessions.

MuleHQ is a dashboard that provides device-level management of app servers, databases and other assets. Policy-driven alerts that allow management against the SLA are also provided. The server side performs auto-discovery of services and assets of which Mule is aware. Mule components, connectors, routers and endpoints are monitored by MuleHQ. MuleHQ is still Hyperic.

The presentation on Mule Saturn started with the challenges of integration, as well as supportability and maintainability of the integration solution. Mule Saturn reduces the time needed to discover problems and recover from them, helping to meet SLAs by "instrumenting Mule" to allow increased monitoring and proactive response through Mule Saturn and MuleHQ. Mule Saturn provides a view into the integration process and data flow for business users, whereas MuleHQ provides monitoring for the technical users. There is role-based access to allow editing of the XML only by designated users.

The first use case provided was for a Supply Chain Integration. The example considers a batch of POs containing two orders, one that succeeds and one that fails. Drill down through hyperlinks is possible. Essentially, the use case walked us through Mule Saturn.

The second use case went through the workflow of approving an employee's expense report with a similar walk-through of Mule Saturn showing a failure in the process where the manager never approved the expense resulting from an email failure to communicate. :>>

OpSource at MuleCon

OpSource provides web operations for SaaS companies. They deliver about 250 on-demand applications involving billions of transactions per day. The backend is Mule. OpSource uses the best solutions to their challenges, open source or proprietary, which says a good deal about Mule.

Their opportunity is DATA. By creating a canonical data model, they have a single set of SQL and XML schemas for the data model, resulting in the ability to create ODS and DW. The other opportunity is SERVICES for their customers and their customers' customers.

The solution was to implement an ESB for Integration/SOA, which resulted in a fully multi-tenant version of the Mule ESB. OpSource Connect is reliable and scalable, and fully PCI compliant.

The use of Mule led to the discovery of Mule partner U1 Technologies and their Ambrosia messaging platform.

And they're hiring.

Mule eCommerce Case Study Online Pharmacy

The second session in the first user breakout is by Craig Sutter, Technology Director of VetSource.

VetSource deals with many challenges from the multitude of partners' and manufacturers' integration points to regulatory requirements. Existing platforms don't scale, and are often in older, proprietary Java frameworks.

The objective that led to VetSource choosing open source solutions is flexibility without vendor lock-in. One reason to use Mule is the ability to use POJO based services.

[I broke away real quick to return a call to a friend from back East. He'll be in SF on the 24th, so don't bother me that day.] ;)

Other reasons to use Mule:

  • Ease of use, very fast to prototype as proven in their pilots
  • Standards based but without requirements to use any particular standard
  • Support is available as needed

The entire platform is OSS: Xen VM, CentOS 5, MySQL 5, Wicket, Cayenne, Spring and an open source veterinary application. This approach, and the use of Mule, allows them to reuse and repurpose older web applications and legacy services. Other services are basic POJOs exposed through a RESTful API. Craig mentioned that he would like to see some changes in the way REST is implemented in Mule, and then mumbled that he'll be talking to the Mule guys later today about that. The difference between this and what might happen were a customer to do the same at a BigProprietaryVendor World conference is that the Mule guys will listen. I also wouldn't be surprised if they implement those changes over the next several weeks, or provide a way for Craig's folk to do so in a maintainable fashion.

One example of the complexity involved is that they may pass an order for a drug that has special shipping and handling requirements directly to the manufacturer or specialized distributor via EDI using a Mule connector.

Users on Mule

The afternoon sessions are split into a Developer Track and a User Track. I decided to attend the User Track. The first speaker was Shigemoto Fujikura of OGIS International speaking about the S2Mule project that they implemented. As part of an all-encompassing project affecting dozens of systems, the first target was an HRM system, with the first application being a mainframe skills management system that was being migrated to an ERP system. Microsoft Excel was used as a client by the executive staff against the old system, and they desired to continue to do so under the new system. By using SOA, they were able to define the system boundaries through the system interfaces, and were able to allow MS Excel to access the SOA network [SOAP] using VBA. They began the project with the 2005 open source Mule ESB project, as it allowed them to start small and execute trials to test out decisions quickly.

Seasar2 is a widely popular open source framework in Japan, making Java dependency-injection simple with aspect oriented programming. The S2Mule project brings Mule into Seasar2.

Hiroshi Wada of the University of Massachusetts has been working with OGIS International to use modeling techniques to define non-functional properties in SOA separately from functional requirements to allow the reuse of services and connectors in different contexts. The Mule ESB has provided the test platform for their use of UML, business process modeling notation [BPMN], feature models and aspect oriented programming.

Partner and Customers at MuleCon2008

The rest of this first morning of MuleCon2008 seems to be devoted to partner and customer use cases. The afternoon will be breakout sessions separated into a user track and a developer track. This is in stark contrast to last year's MuleCon, where the first day was training on Mule, while the second day was filled with alternating sessions devoted to core developers talking about the roadmap and users telling their stories. This change, coupled with the higher attendance, indicates to me the maturation of Mule and MuleSource.

Jahan Moreh, VP Technology of U1 Technologies, spoke about high performance messaging using Mule. He started off talking about messaging standards such as Java Message Service (JMS) and Advanced Message Queuing Protocol (AMQP). He gave a very good overview, almost a tutorial, on messaging in general and the trade-offs involved in high performance messaging.

Eugene Ciurana, Director of Systems Infrastructure at Leapfrog Enterprises, Inc., spoke on "Son of SOA: Resource Oriented Computing and Event Driven Architectures" (PDF download). Eugene provided an entertaining look at the limitations of SOA and the challenges and rewards of implementing SOA at Leapfrog, from the perspective of Charlie the Farting Dog. Well, not quite; this was an example of the upcoming Tag from Leapfrog reading a book. Leapfrog is very driven to implement new systems in very short timeframes using best of breed components. Of most interest to me is the shift towards consuming resources, blurring the distinction between services and data sources. All components of a system are viewed as resources to be consumed synchronously or asynchronously. There is no distinction between data, objects or services, and there is no dependency on a programming language or framework. Programs map logical and physical locations through identifiers in traditional computing models. Resource Oriented Computing defines resources through verbs and logical identifiers, much like REST, but not exactly. Eugene went into a good comparison of how Java, REST and ROC handle Resource, Identify, Resolve, Compute, Immutability. Eugene selected Mule ESB as his ROC Backbone (workflow, transactions, transformations, logging and routing) and for the Resource Container. Extending the web protocols to go beyond the four REST verbs is easy within Mule. The initial ROC implementation took about 60 days, with developers just writing POJOs and everything deployed as a single JAR file. The ESB backbone is actually as many identical implementations of Mule as he needs, thus he can scale the backbone horizontally as much as needed. To summarize Eugene's conclusion, complex systems are easier to code and maintain if implemented as small blocks which can be mapped as resources that can be consumed in a stateless fashion, reducing implementation costs by 70% and maintenance costs by 30-40%.
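
The "verbs plus logical identifiers" idea can be sketched in a few lines of Python. This is only my own illustration of the concept as described in the talk, not Eugene's implementation and not a Mule API; the `ResourceSpace` class, the `res:/catalog/book` identifier and the `NARRATE` verb are all hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Handler = Callable[[dict], dict]


class ResourceSpace:
    """Toy resolver: (verb, logical identifier) -> stateless handler."""

    def __init__(self) -> None:
        self._bindings: Dict[Tuple[str, str], Handler] = {}

    def bind(self, verb: str, identifier: str, handler: Handler) -> None:
        # The identifier is logical; callers never see where the handler lives.
        self._bindings[(verb.upper(), identifier)] = handler

    def request(self, verb: str, identifier: str,
                payload: Optional[dict] = None) -> dict:
        handler = self._bindings.get((verb.upper(), identifier))
        if handler is None:
            raise LookupError(f"cannot resolve {verb.upper()} {identifier}")
        # Handlers receive a copy of the payload and keep no state between calls.
        return handler(dict(payload or {}))


def build_space() -> ResourceSpace:
    space = ResourceSpace()
    # Hypothetical bindings; nothing here comes from Leapfrog's system.
    space.bind("GET", "res:/catalog/book",
               lambda p: {"title": p.get("title", "unknown")})
    # A verb outside the usual four, echoing "beyond the four REST verbs".
    space.bind("NARRATE", "res:/catalog/book",
               lambda p: {"audio": f"reading {p['title']}"})
    return space
```

Because each handler is stateless and addressed only by verb and identifier, any number of identical `ResourceSpace` instances could serve the same requests, which is the property that makes the horizontal scaling described above plausible.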

Roy dela Paz of Biogen Idec [Nasdaq: BIIB] provided a "Customer Case Study: Move over Proprietary, Here Comes Mule". The BEA WebLogic Integration Platform v8.1 has been in use at BIIB for internal and external data integration since 2004, but the learning curve has been steep [very abstracted, proprietary, non-standard development environment], it's been unstable [Nagios scripts have helped here], and it hasn't scaled well. Mule came back into play through a "water cooler conversation", and investigation showed that Mule provided the required capabilities in a lightweight, stable, scalable package using standard XML and POJO development skills. Roy compared the complexity of WebLogic vs Mule to that of the Sideways Bicycle vs a normal single-speed bike.

MuleCon2008 Intro and Year in Review

Mahau, the Director of Marketing, gave the introduction. There are more than twice as many attendees this year as last. All members of the core development team are here.

Dave Rosenberg provided the year in review.

  • Four babies, all girls, have been born to the Mule "family" in the past year
  • MuleSource launched in Japan
  • Mule Galaxy for governance was released
  • Mule2.0 and Mule RESTPak coming soon

Dave began by saying that building good products is important, but building the right products is equally important. Customers have been successful in SOA when they use an ESB as a foundation. Governance [Mule Galaxy] is the next important step. Mule Saturn for data monitoring and MuleHQ for system monitoring & management round out the Mule offerings.

The MuleForge has a lot of new material from developers in 118 countries.

From the survey that MuleSource recently conducted with their users, while up-front cost is a consideration, the most important value from open source software is that the users get a feeling of control by having access to the code.

When Dave talks about the growing ecosystem, it now includes over 40 partner companies.

Ross Mason, creator of Mule and CTO & co-founder of MuleSource, provided the Mule Product Updates and Roadmap session.

A year ago, the Mule project was being transitioned from an open source project to an enterprise product, including the announcement of MuleHQ.

This year, the open source offerings are the Mule ESB, Mule Galaxy, Mule IDE and the MuleForge.

The Mule community edition will have a major release this quarter, with another planned for the last quarter of this year. Ross presented the 2008 timeline.

Mule 2.0 has had a lot of architectural changes under the covers. More visible to the users will be schema-based configuration. There will be expression evaluation support [XPath, XQuery, Groovy, context information for routing, transformations, etc].

Mule Galaxy is a registry and runtime governance that is deeply integrated with the Mule ESB.

Mule IDE is Eclipse based.

Mule Enterprise 1.5 goes through extensive QA, testing and platform certification processes. It also includes MuleHQ, which allows things like profiling, and Mule Saturn (beta), which allows visualization of the flow through Mule.

The community edition will have more frequent releases but will be potentially less stable.

Mule Enterprise 2.x will add some premium connectors, such as for Websphere MQ. Migration tools will be added to MuleHQ and Mule Saturn. The migration tools include even easier migration from Mule 1.x. One major change in 2.x is the addition of SOA Runtime, hot deployment of components. When one has a service running in Mule, and one changes the message format for a long running transaction, hot deployment allows a way to gracefully change over from the older to the newer format, without shutting down.
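
The graceful changeover can be pictured with a small Python sketch. This is not how Mule implements hot deployment; it is a minimal illustration, under my own assumptions, of a running service that accepts a new message-format handler while the old one keeps serving in-flight work. The `VersionedService` class and the `version` field on messages are hypothetical.

```python
import threading
from typing import Callable, Dict

Handler = Callable[[dict], str]


class VersionedService:
    """Toy service that can swap message-format handlers while running."""

    def __init__(self) -> None:
        self._handlers: Dict[int, Handler] = {}
        self._lock = threading.Lock()

    def deploy(self, version: int, handler: Handler) -> None:
        # Adding a handler does not disturb handlers that are already deployed.
        with self._lock:
            self._handlers[version] = handler

    def retire(self, version: int) -> None:
        # Called once no long-running transaction still uses the old format.
        with self._lock:
            self._handlers.pop(version, None)

    def handle(self, message: dict) -> str:
        version = message.get("version", 1)
        with self._lock:
            handler = self._handlers.get(version)
        if handler is None:
            raise ValueError(f"no handler deployed for format version {version}")
        return handler(message)


def handle_v1(message: dict) -> str:
    """Old format: a single 'name' field."""
    return f"order for {message['name']}"


def handle_v2(message: dict) -> str:
    """New format: separate 'first' and 'last' fields."""
    return f"order for {message['first']} {message['last']}"
```

The design choice worth noting is that both versions are live during the changeover window: deploying version 2 is purely additive, and version 1 is retired as a separate, later step.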

Mule Galaxy Enterprise provides for Clustering, Microsoft Office Indexes, PDF indexes, Workflow, Replication/federation, Premium (task based) documentation and QA.

Ross provided a few use cases.

The Mule IDE (promised to be fully functional and stable by the end of the year) allows one to create, debug and deploy new instances of Mule, while using Mule Galaxy to share artifacts and design time policies. While developing in the Mule IDE, one can extend the functions of the instance by getting components from the MuleForge.

Within Mule Galaxy is something called Mule netboot, which is essentially a bootstrap node for Mule. Mule HQ can discover these nodes, which can be configured to download their configuration from a Mule Galaxy URL. Mule netboot then starts up Mule ESB. This is much simpler than the current Mule patch manager method of deployment.

Monitoring Mule with MuleHQ provides triggers and alerts for things like error and load conditions. Mule Saturn provides more of a business view, while Mule Galaxy provides the runtime governance as to who/what is using the service and how.

Ross is now providing a tour of MuleForge. One strong aspect of MuleForge is that it allows folk to create a connector, for example, and then release & manage it through the MuleForge. There are also proposals for projects, so that one can seek help in creating, or at least in determining interest in, certain connectors or extensions for Mule.

There is a certification process for projects:

  1. Not Certified
  2. Community Certified
  3. Partner Certified
  4. MuleSource Certified - MuleSource will support that connector directly as part of a paid subscription

MuleSource is now doing targeted distributions from MuleForge, for example, a REST pack that allows for ATOM publication that provides not just a code bundle, but also instructions, descriptions and definitions on using the pack.


As noted in our pre-OSTT thoughts, and in some of our OSBC notes, there have been some holes in the open source offerings, most notably around governance. The release of Mule Galaxy will fill this hole. At last year's MuleCon I noticed a dichotomy among the users after the training: some embraced the XML editing for configuring Mule and others missed their wizards. The schema-based configuration and the seamless integration of the Mule IDE should help to satisfy both camps.

Dave came back up to garner suggestions for the ending campground. The audience suggested:

  1. .Net integration
  2. EDI transformations and using the MuleForge
  3. Dynamic transformations
  4. performance and test authoring for the full life-cycle
  5. migration from 1.x to 2.x
  6. using the IDE
  7. Internationalization
  8. BPEL
  9. use cases for Mule Galaxy including non-Jackrabbit, JBPM & Alfresco
  10. clustering and high availability
  11. Integrating various ESB/SOA/integration product lines and can Mule be used to bring a "one of everything" shop together

MuleSource Buys Microsoft, Oracle and Google

In a move that was anticipated by industry insiders for several months now, MuleSource CEO Dave Rosenberg announced at the opening breakfast for MuleCon2008 that MuleSource had made successful hostile bids for Microsoft, Oracle and Google - pending SEC approval, which is expected soon.

As part of his OSS (Open Source Swashbuckling) strategy, Dave laid out his roadmap for each of the three, soon-to-be-absorbed companies.


Microsoft will be dismantled, as it has obviously outlived its usefulness. The main value from the company is in its many developers and .Net community, all of whom will be set free to work on innovative products in the open source realm.


Dave admitted that the only reason for the Oracle purchase was to obtain access to BEA Tuxedo, providing Mule with much needed COBOL connectors. Oracle's database business has been undermined in recent years by MySQL for dot-com and Web2.0 applications, PostgreSQL for enterprise transactions, and LucidDB for BI. The applications business is increasingly being owned by Compiere [pronounced in the Italianate way] and OpenBRAVO. Partnerships are being planned with these open source companies.


Google has lived off of open source projects and will now be giving back in a big way. As soon as the acquisition is complete, all Google code will be released under AGPLv3. All Google's products will be redesigned to conform to a new SOA and MDM strategy, to allow for even more efficient enterprise mashups.


What more is there to say? Get ready to ride the Mule, and welcome to the new world order.

|-| ;D


At the beginning, The Open Source Solutions Blog was a companion to the Open Source Solutions for Business Intelligence Research Project, and book. But back in 2005, we couldn't find a publisher. As Apache Hadoop and its family of open source projects proliferated, and in many ways, took over the OSS data management and analytics world, our interests became more focused on streaming data management and analytics for IoT, the architecture for people, processes and technology required to bring value from the IoT through Sensor Analytics Ecosystems, and the maturity model organizations will need to follow to achieve SAEIoT success. OSS is very important in this world too, for DMA, API and community development.
