Category: "ETL/EAI/ESB"

KETL First Meeting

Clarise and I met with Marshall Stevenson, Juan Carlos Rojas and Nicholas E. C. Wakefield of Kinetic Networks, the company behind KETL. The open source KETL Framework is the ETL tool providing data gathering and data quality for the Bizgres Clickstream initiative.

Marshall, Clarise and I are all ex-Oracle, and worked together at a boutique consultancy eight years ago. In addition, Marshall has been specializing in BI solutions since then, and has even provided us with ETL specialists as supplemental staff for our own projects. Marshall recently joined Kinetic Networks as their VP of Sales, and set up this meeting.

With KPMG and IBM in his background, Juan founded Kinetic Networks in 1999, and is currently serving as the CEO. Kinetic Networks provides a data warehousing platform in a Software as a Service environment and does Business Intelligence System Integration, delivering BI solutions with a heavy emphasis on operational excellence.

After graduating from Oxford University, Nick worked at EDS and MicroStrategy before joining Kinetic Networks in 2000. Based upon Nick's work and their success with KETL, the company split its focus between System Integration and technology development earlier this year.

A custom DW prototype for a client led to the first iterations of KETL. Under Nick's guidance it soon expanded, adding XML capabilities and a plugin architecture using their own Java classes. Drawing on their SI group's best practices for Data Quality, they added the capability for data QA on the fly. It wasn't long before they found a customer who wanted the tool more than System Integration, and a product strategy came about.

KETL has been used in production environments for over three years. Some of their experiences include:

  • customers who chose KETL because the cost savings on the "backend" ETL tool allowed for a better front-end Reporting, OLAP and Portal solution
  • running KETL alongside commercial ETL, with a custom plugin allowing KETL to perform data quality for an existing install
  • handling hundreds of millions of records with KETL in its open source form
  • development driven by the use of KETL in production environments

"Performance is tuned in areas that are important to our clients to date. The ability to read multiple files across multiple devices in parallel, using multiple CPUs, is something KETL can do relatively easily; our clients would not have used KETL extensively otherwise.

"Historically KETL has been used for commercial implementations along with services; packaging has not been the focus. Over time this will improve."
-- Nick Wakefield, CTO Kinetic Networks

KETL was made from the beginning to handle very large data sets. This, coupled with its ability to perform Statistical Analysis for on-the-fly data quality, makes KETL a perfect tool for Clickstream analysis. This capability, together with working alongside Greenplum at one customer, led to the open sourcing of the KETL Framework and its inclusion in the Bizgres stack.

Any company wishing to open source their product must decide how best to join the open source community. Choosing a license was very difficult. The KETL Framework is licensed under the LGPL. This allowed them to build commercial components upon it. The MPP and Data Quality modules are licensed commercially through an upgrade path.

Another choice to be made involves building a community around KETL. Currently, KETL relies on the Bizgres community. However, KETL works against any target database, and through its plugin architecture, can work with any source system. Thus, Kinetic Networks is looking towards building its own community.

Kinetic Networks has grown organically since its inception, without requiring any external funding. As such, it relies on its System Integration group, including remote management of customer business intelligence solutions, for cash flow as it grows its open source and product strategy.

As we proceed with our Open Source Business Intelligence book project, we're planning to stay in touch with Marshall, Nick and Juan, doing more interviews with them, and even a podcast or two. Bernard will be applying his Open Source Maturity Model, and we're looking forward to playing with KETL in the lab. We'll keep you posted.

While Kinetic Networks is finalizing their open source strategy, there are no links directly from the Kinetic Networks website to information about KETL. However, Nick has graciously provided the following links for the convenience of our readers. [As per the comment below, the following documentation is no longer available. Please check the documentation section of]

  • KETL for Data Integration [PDF]
  • KETL XML Steps Overview [PDF]
  • KETL Training [PDF]
  • KETL Installation Guide [PDF]
  • KETL System Operations Guide [PDF]
  • Documentation related to Bizgres and the stack
  • Software, currently it's part of Bizgres Clickstream only


At the beginning, The Open Source Solutions Blog was a companion to the Open Source Solutions for Business Intelligence Research Project, and book. But back in 2005, we couldn't find a publisher. As Apache Hadoop and its family of open source projects proliferated, and in many ways, took over the OSS data management and analytics world, our interests became more focused on streaming data management and analytics for IoT, the architecture for people, processes and technology required to bring value from the IoT through Sensor Analytics Ecosystems, and the maturity model organizations will need to follow to achieve SAEIoT success. OSS is very important in this world too, for DMA, API and community development.
