BI To Build or Buy or Open Source

As we prepare for our talk at Campus Technology 2007, we're considering some of the topics we might cover.

First off... "What is BI?" Even on Wikipedia, this is a topic of conversation, as you can see by clicking on the "discussion" tab of the Business Intelligence article. For our purposes, we'll define business intelligence as business processes and the integrated applications or individual tools that are implemented specifically to provide data management, reporting and analytics to solve specific business and user community needs.

As with other IT initiatives, a BI program can be built from scratch; built from an existing framework, such as the open source Eclipse BIRT or the proprietary tools from SAS; purchased as a COTS product; bought as SaaS; obtained as an appliance; or implemented from open source projects.

I can't imagine an IT shop that hasn't faced the decision to build or buy, and then the decision of how to build or what to buy. And many IT shops have developed a culture that leans one way or another. Today, however, there are some new wrinkles... new variables in the decision-making equation. There are more options than ever before:

  • more language options,
  • more vendors and more vendor consolidation,
  • more libraries,
  • more open standards,
  • more architectures,
  • more services, and
  • more concerns.

These options provide more flexibility in meeting organizational and user needs, but also more challenges in leading the way to a responsive, secure, compliant, cost-effective and maintainable information infrastructure.

Some of these options are fairly new, and can have wide-ranging impact for internal standards and procedures going forward. Among these new options, especially in BI strategies, are Appliances, SaaS [not SAS] ;) and Open Source Solutions. Even more confusing, some appliances use open source software components, some are strictly proprietary in all their parts, and some are a mix; while some open source vendors make their software available only under an approved open source license, and others use dual licenses: open source and proprietary, sometimes with different features or different names for the different licensed versions. Oh, and we're not even going to touch the myriad of open source licenses out there, nor the debate over whether or not certain licenses or certain vendors are "really truly open source".

Every situation is different, and we're not going to try to solve all the world's problems, or even all BI strategies, in a simple blog post.

One point that I would like to make is that BI is, IMNSHO, especially well suited to open source solutions. At their best, proprietary solutions still only give one a starting point, which must be customized for the source system customizations and uniqueness inherent in each IT shop, as well as the specific business processes, organizational considerations and user desires that are driving the BI program. Here are some examples of what I mean.

  • A BI solution is likely to draw from a variety of source systems [financial, human resources, full ERP, grant management, housing, student records, large legacy software, new enterprise applications, small local databases and spreadsheets], and may have different real-time and historical requirements than those source systems. You may be considering implementing SOA throughout your enterprise, and exposing some or all of those potential source systems as services. Thus, do you connect directly or do you need ETL, or maybe a mix of ETL and ESB, to get your source data to your BI system? Whether you purchase a proprietary system from Informatica or Oracle or DataStage or Tibco or BEA or IBM, or download an open source solution such as KETTLE or KETL or Jetstream or Mule or ServiceMix, you can't just install the software, auto-magically connect to all of your source systems and have a nice working system at the end of the day.
  • Any reporting tool just gives you the ability to design and format reports. There probably aren't any nice canned reports, out of the box, that will make your users dance in the hallways. So, do you buy Crystal Reports, or do you download JasperReports, JFreeReport, or OpenReports?
  • The same can be said and asked for analytical tools such as OLAP and MDDB, predictive tools, data mining and dashboards.
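
To make the first bullet concrete, here is a minimal sketch of the kind of glue work that remains no matter which ETL tool you pick. It is our own illustration, not the way any of the products named above work: the two "source systems" (a student records CSV export and a housing database), the table and column names, and the figures are all hypothetical, and Python's built-in sqlite3 stands in for both the source database and the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export from a student records system.
STUDENT_CSV = """student_id,name,department
1,Ada,Physics
2,Grace,Mathematics
3,Alan,Mathematics
"""


def extract_students(csv_text):
    """Extract: read the CSV export as a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def extract_housing(conn):
    """Extract: read housing fees from the (hypothetical) housing database."""
    rows = conn.execute("SELECT student_id, fee FROM housing").fetchall()
    return {student_id: fee for student_id, fee in rows}


def transform(students, housing):
    """Transform: reconcile key types and join the two sources on student_id."""
    merged = []
    for row in students:
        sid = int(row["student_id"])  # the CSV yields strings, the database integers
        merged.append((sid, row["department"], housing.get(sid, 0.0)))
    return merged


def load(conn, rows):
    """Load: write the joined rows into a reporting table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_housing "
        "(student_id INTEGER, department TEXT, fee REAL)"
    )
    conn.executemany("INSERT INTO fact_housing VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline():
    # Hypothetical source 2: a small housing database, built in memory here.
    housing_db = sqlite3.connect(":memory:")
    housing_db.execute("CREATE TABLE housing (student_id INTEGER, fee REAL)")
    housing_db.executemany(
        "INSERT INTO housing VALUES (?, ?)", [(1, 1200.0), (3, 900.0)]
    )

    warehouse = sqlite3.connect(":memory:")
    load(warehouse, transform(extract_students(STUDENT_CSV), extract_housing(housing_db)))

    # A report query against the loaded table: housing fees by department.
    return warehouse.execute(
        "SELECT department, SUM(fee) FROM fact_housing "
        "GROUP BY department ORDER BY department"
    ).fetchall()


if __name__ == "__main__":
    for department, total in run_pipeline():
        print(department, total)
```

Even in this toy case, somebody had to know that the CSV delivers keys as strings while the database delivers integers, and had to decide what a missing housing record means. Neither a proprietary nor an open source tool makes those decisions for you.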

The point is that even for small organizations, or simple reporting needs, the best that you get from proprietary vendors is the tools to create the BI solutions needed, and the best that you get from open source solutions is exactly the same, plus the actual source code to customize if need be and a community to help share the pain. Proprietary BI solutions can cost hundreds of thousands to millions of dollars or euros or marks or credits in licensing costs plus consultants, plus programmers, plus hardware, plus support. Open source is much the same on the plus costs, but greatly reduced in the licensing cost, sometimes to zero. The bonus may be slicker wizards to help your programmer-analysts deliver the solution in the proprietary case, or greater flexibility and control in the open source case. The trade-off here is the same as those old buy IBM or build COBOL solutions we once considered: what will get the project delivered on-time, within-budget and meeting the specifications?

Again, we're not trying to present the ultimate solution here, but hopefully, we've opened up the possibilities. So, take the lead, and generate your strategy one step at a time. I have to run, but I'll come back later to fill in some links. For open source projects and vendors, take a look at our linkblog in the side column.

Pentaho on Web2-roids

As a great example of an open source solution adding value and innovation beyond that of proprietary vendors, Pentaho announced today the availability of new libraries allowing developers to make full use of AJAX for the quick, rich and lightweight interface common to Web2.0 applications. Combine this with the Pentaho BI | Google mashup and you see the wide-open possibilities of open source data analytics.

"Pentaho AJAX leverages the capabilities of AJAX technologies to allow development of highly interactive, lightweight BI applications that increase speed and responsiveness for end users by providing a thin-client interface as well as minimizing browser refreshes and round trips to the server. Critical data is cached in the browser using HTML and XML, and JavaScript techniques are used to alter the user interface without re-drawing the page or requesting additional data from the database."
-- Pentaho Open Source Business Intelligence: News Release "Pentaho Announces General Availability of Pentaho AJAX"

Open Solutions Alliance

The Open Solutions Alliance was formally announced on 2007 February 13 at the LinuxWorld Open Solutions Summit being held in New York City, NY, USA. In their own words...

"The Open Solutions Alliance will seek to expand the market for business open source software solutions through cooperative action that increases awareness of member solutions, reduces barriers to customer adoption, facilitates interoperability, and explains the benefits of open source solutions."
-- Open Solutions Alliance "Our Mission"

Among the founding members are OSBI solution vendors JasperSoft [BI Suite], Talend [ETL], and EnterpriseDB [RDBMS]. One goal of the organization is to improve interoperability among the members' products. In an example given by Michael Harvey, chief marketing officer of CentricCRM, the work could make it easier for customers using their customer-relations software to use JasperSoft's data-mining software to plumb customer records for particular information.

Ease of, or even "pre-packaged" interoperability among applications such as CRM, BI, RDBMS, systems management [Hyperic is a founding member, with Groundwork making a membership announcement today] and collaboration will ease deployment considerably. If, of course, the member projects provide the solutions required by the business users.

During the rush in the late '90s to deploy large, integrated packages such as those provided by Oracle, PeopleSoft [now Oracle] and SAP, CIOs saw the folly of trying to impose business processes on users that matched the software, or the cost of customizing the software in a usually poor attempt to match the business processes. Indeed, this is one reason that we are so excited by open source solutions: the software can be much more easily adapted to users' best practices, thus reducing the learning curve and increasing adoption by the users.

I believe that the Open Solutions Alliance will be, overall, good for open source communities everywhere. However, I don't believe that its mission will significantly impact the growth of open source, nor bring its members' offerings into better alignment with all those different user needs out there.

From my feed reader, here's a list of other opinions on the Open Solutions Alliance.

Campus Technology 2007

Mary Grush has invited Clarise and me to speak at the Campus Technology 2007 conference to be held in Washington, D.C., USA. We'll be speaking on Wednesday, 2007 August 1 at 11:15am-12:15pm. In general, institutes of higher learning are only beginning to explore data warehousing and business intelligence technologies, and, in general, they don't like what they're seeing from traditional, proprietary vendors. From our initial conversations with Mary, here's our direction. We'll develop this here in the OSS Blog as much as we can. We would really appreciate any comments to help us refine our talk.

Cost Effective BI/DW Strategy

Abstract

Our strategy for reporting, data management and analysis programs and projects responds to user needs quickly without blowing the budget. Using open source software, project management, and user involvement, this strategy economically and efficiently meets campus-wide and departmental data warehouse, data mart, and business intelligence needs through dashboards, reporting, OLAP, and data mining tools. Cost-effective results can be in users' hands in as little as one week.

Points to be covered
  1. A framework leading to an economical strategy/BI-roadmap for data warehousing, data management or data analytics programs
  2. Program, Project & Risk Management methods
  3. Risks and advantages of using open source solutions for BI suites or DW/data mining components such as ETL/EAI/ESB, RDBMS & MDDB, metadata management, reporting, OLAP engines, multi-variate analysis (a.k.a. "slice & dice"), machine learning, portals, and dashboards
  4. User involvement for determining specifications and implementing quality control
  5. Costing, value and return

Take-away Points
  1. Strategy and tactics should be separated with a clear iteration plan for quick, economical response to user & organizational needs
  2. Agile development doesn't mean a lack of project management nor should it allow scope creep
  3. Open source software has matured to the point where it can certainly be used for prototyping and even production

Three Most Important Open Source Projects

As we've noted in our OSBI Daily links for yesterday and today [wiki archive or lens archive], two events are showing the growing importance of open source solutions. The first is a survey being conducted by Nat Torkington. The second is the inclusion of several open source enterprise software packages in the 17th Annual Jolt Award nominations.

"... a theory and I need numbers to prove or disprove it. If you use open source, please tell me in the comments what you think are the three most important open source projects going today. I'll post my hypothesis, the numbers, and my conclusion next week."
-- Nat Torkington in Survey: Three Most Important Open Source Projects at O'Reilly Radar

The open source BI suite, Pentaho, and the open source ESB, Mule, are included in the Jolt award nominations. I've included these two, as well as the open source and SaaS collaboration and messaging suite Zimbra, as my three "picks" for Nat's survey. It's a tough choice; there are many competing and complementary projects that fit as well. Any of the open source BI solutions that we track for our OSBI research could be a top three choice. Just take a look at our linkblog or the link modules on our lens. In bringing collaboration to all, Alfresco and other ECM open source projects may have as much impact as Zimbra or open source competitors to Microsoft Exchange such as Open-Xchange Server.

Most of the choices in the close to 200 responses that Nat had received when I made this trackback are more of the infrastructure and developer tool variety. I think that open source solutions have moved well beyond that niche, and that the open source movement is ready to capitalize on the explosion in open source projects, VC funding and businesses that occurred in 2004 through 2006. This mini-bubble is contracting a bit, leading to some consolidation and a lot of stability in business models and business growth, and in community definition, contribution and support. At the same time, controversy still abounds around licensing [GPLv3 and MPL+Attribution] and what commercial open source really means. But it's all good, and it all indicates that the open source movement is maturing.

What are your choices for the three most important open source projects? Get over to O'Reilly Radar and make your opinion heard.

Update to our Palo Description

We were contacted by Stephanie Endlich of Jedox GmbH, who suggested a better description for Palo:

"Palo is a memory based Open Source MOLAP which is able to consolidate data hierarchies in real time and which supports write-back of data. It provides a MDDB and a free Microsoft Excel add-in. For flexible integration Palo offers APIs for Java, PHP, C and .NET. From their homepage... 'Palo is an advanced data store for Microsoft Excel that allows you to handle large amounts of Excel data on a small number of worksheets. In addition, it also allows you to share Excel data real-time with your colleagues.'"

We also have the following from their website on Palo Basics.

"Palo is made for Microsoft Excel. It is a cell-based database that is multidimensional, hierarchical and memory-based. Now what do those terms mean in particular?

"Trying to do Business Intelligence, Financial Analysis, Budgeting or Planning with Microsoft Excel? Looking at Excel workbooks that are difficult to maintain because of their size? Then Palo is for you.

"Palo is an advanced data store for Microsoft Excel that allows you to handle large amounts of Excel data on a small number of worksheets. In addition, it also allows you to share Excel data real-time with your colleagues. You get exciting new Excel features without losing Excel's flexibility. Works with your existing Excel 2000/XP/2003."

Let us know what you think in comments, and we'll update our OSBI Lens' linkblog "Links to OSS OLAP Tools" description accordingly.
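
For readers wondering what "cell-based, multidimensional, hierarchical and memory-based" with "write-back" looks like in practice, here is a toy sketch of the idea. To be clear, this is our own illustration and not Palo's implementation or API: the `Cube` class, its methods, and the sample dimensions and figures are all hypothetical. Base cells live in an in-memory dictionary, and a consolidated element such as a quarter or a year is computed on read by summing the base cells beneath it.

```python
from itertools import product


class Cube:
    """A toy in-memory, cell-based cube with one hierarchy per dimension."""

    def __init__(self, hierarchies):
        # One dict per dimension, mapping a consolidated element to its children.
        self.hierarchies = hierarchies
        # Only base (leaf-level) cells are stored, keyed by a tuple of elements.
        self.cells = {}

    def write(self, coords, value):
        """Write-back: store or overwrite a base cell."""
        self.cells[tuple(coords)] = value

    def _leaves(self, dim, element):
        """Expand an element of dimension `dim` into its leaf elements."""
        children = self.hierarchies[dim].get(element)
        if not children:
            return [element]
        leaves = []
        for child in children:
            leaves.extend(self._leaves(dim, child))
        return leaves

    def read(self, coords):
        """Read a cell; consolidated coordinates are summed at read time."""
        leaf_sets = [self._leaves(dim, element) for dim, element in enumerate(coords)]
        return sum(self.cells.get(combo, 0) for combo in product(*leaf_sets))


def build_demo_cube():
    # Hypothetical dimensions: time (months, quarters, year) and region.
    time = {
        "Year": ["Q1", "Q2"],
        "Q1": ["Jan", "Feb", "Mar"],
        "Q2": ["Apr", "May", "Jun"],
    }
    region = {"All Regions": ["East", "West"]}
    cube = Cube([time, region])
    cube.write(("Jan", "East"), 100)
    cube.write(("Feb", "West"), 50)
    cube.write(("Apr", "East"), 25)
    return cube


if __name__ == "__main__":
    cube = build_demo_cube()
    print(cube.read(("Q1", "All Regions")))
    cube.write(("Jan", "East"), 110)  # a write-back is visible in every rollup at once
    print(cube.read(("Year", "All Regions")))
```

Because consolidations are computed from the base cells when they are read, a changed base cell shows up immediately in every total that includes it. A real MOLAP engine would add caching, rules and much more; the sketch only shows the shape of the data model.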

Bloggers Blogging Open Source

I recently discovered that Julian Hyde started blogging in August of 2006. Julian is the lead developer of the Mondrian open source OLAP engine, one of the first open source projects related to BI. This prompted me to publish our OPML file of Open Source Solutions blogs and news feeds [ctrl-click, two-finger-tap, right-click, or whatever you do to get the context menu on your machine and save that link].

We hope that you find it useful. If you know of other related blogs, please provide their feeds and URIs in the comments, and we'll update and republish the OPML.
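
If you would rather pull the feed addresses out of an OPML file with a script than import it into a feed reader, a few lines of Python will do it. This is a minimal sketch using only the standard library: the sample OPML below is made up for illustration (the example.org feeds are hypothetical, not entries from our file), and we assume the common convention of `outline` elements carrying `text` and `xmlUrl` attributes.

```python
import xml.etree.ElementTree as ET

# A made-up OPML document; the feeds and URLs are hypothetical placeholders.
SAMPLE_OPML = """<opml version="1.1">
  <head><title>Sample feeds (hypothetical)</title></head>
  <body>
    <outline text="OLAP">
      <outline text="Example OLAP blog" type="rss"
               xmlUrl="http://example.org/olap/feed.xml"
               htmlUrl="http://example.org/olap/"/>
    </outline>
    <outline text="Example ETL news" type="rss"
             xmlUrl="http://example.org/etl/rss"/>
  </body>
</opml>
"""


def list_feeds(opml_text):
    """Return (title, feed URL) pairs for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_text)
    feeds = []
    # iter() walks nested outlines too, so category folders are handled.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:  # folder outlines have no xmlUrl and are skipped
            feeds.append((outline.get("text", ""), url))
    return feeds


if __name__ == "__main__":
    for title, url in list_feeds(SAMPLE_OPML):
        print(title, url)
```

To run it against a saved OPML file, read the file's contents and pass the text to `list_feeds`.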

OpenRPT 2.0 Available for Download

OpenMFG announced the availability of OpenRPT 2.0 on 2006 December 20. From their forum...

"OpenRPT 2.0 was retooled from the ground up to take full advantage of the underlying Qt 4 framework. Qt is an open source, comprehensive development framework that includes an extensive array of features, capabilities and tools that enable development of high-performance, cross-platform rich-client and server-side applications. Among the new OpenRPT features enabled by updating from Qt 3 to Qt 4 is the ability to export to Adobe PDF natively for both the OpenRPT API and the rendering application.

"Other new features include support for generic ODBC database connections (the first release only supported the open source PostgreSQL database - and native PostgreSQL support continues in 2.0); enhanced page break functionality; and power tools for developers such as command-line arguments and parameters, multiple document printing, and advanced error trapping.

"Since it was released as an open source project in 2005, OpenRPT has been downloaded over 30,000 times..."
-- OpenRPT report writer and renderer

Current Version Confusion for JasperReports and iReport

We're a bit confused as to what is the current version of JasperSoft's JasperReports and iReport. SourceForge.net has version 1.3.0 of both JasperReports and iReport available for download, but with the disclaimer that...

"JasperReports, the market leading open source business intelligence and reporting engine. This project is being moved to http://www.jasperforge.org/. This project is the home for all things Jasper, Reports, Analysis, Server, and Intelligence."
-- JasperReports Java Reporting home on SourceForge

The only downloads from JasperForge seem to be for documentation. On JasperSoft's site, the only reference is to the November release of 1.2.8, but the downloads page, after a free registration, brings you back to SourceForge.net and the 1.3.0 versions of the software.

I guess the holidays are just delaying the official release announcements. Enjoy.

OSBI for Campus Technology Magazine

I was just interviewed by Bennett Voyles for Campus Technology Magazine. The topic was information about Open Source Business Intelligence, and the article is in response to interest from their target audience, CIOs in higher education. We covered some general topics, like a timeline of OSBI evolution and the types of BI tools available, as well as some specific examples of open source BI tools solving user needs. We discussed the differences between proprietary and open source software in areas such as development, test, community, feature inclusion, selection process, security and stability.

From Ben's description, the article will cover more than our interview and topics of discussion. I'm looking forward to reading it. We'll link to it, when it comes out.


At the beginning, The Open Source Solutions Blog was a companion to the Open Source Solutions for Business Intelligence Research Project, and book. But back in 2005, we couldn't find a publisher. As Apache Hadoop and its family of open source projects proliferated, and in many ways, took over the OSS data management and analytics world, our interests became more focused on streaming data management and analytics for IoT, the architecture for people, processes and technology required to bring value from the IoT through Sensor Analytics Ecosystems, and the maturity model organizations will need to follow to achieve SAEIoT success. OSS is very important in this world too, for DMA, API and community development.
