Category: "software"

Setting up the Server for OSS DSS

The first thing to do when setting up your server with open source solutions [OSS] for a decision support system [DSS] is to check all the dependencies and system requirements for the software that you're installing.

Generally, in our case, once you make sure that your software will work on the version of your operating system that you're running, the major dependency is Java. Some of the software that we're running may have trouble with openJDK, and others may require the Java software development kit [JDK or Java SDK], and not just the runtime environment [JRE]. For example, Hadoop 0.20.2 may have problems with openJDK, and versions before LucidDB 0.9.3 required the JDK. Once upon a time, two famous database companies would issue system patches that we're required for their RDBMS to run, but would break the other, forcing customers to have only one system on a host. A true pain for development environments.

Since I don't know when you'll be reading this, or if you're planning to use different software than I'm using, I'm just going to suggest that you check very carefully that the system requirements and software dependencies are fulfilled by your server.

Now that we're sure that the *Nix or Microsoft operating system that we're using will support the software that we're using, the next step is to set up a system user for each software package. Here's examples for a *Nix operating systems: Linux kernel 2.x derived and the BSD derived, MacOSX. I've tested this on Red Hat Enterprise Linux 5, OpenSUSE 11, MacOSX 10.5 [Leopard] and 10.6 [Snow Leopard].

On Linux, at the command line interface [CLI]:

useradd -c "name your software Server" -s /bin/bash -mr USERNAME
- c COMMENT is the comment field used as the user's full name
-s SHELL defines the login shell
-m create the home directory
-r create as a system user

Likely, you will need to run this command through sudo, and may need the full path:


Change the password

sudo passwd USERNAME

Here's one example, setting up the Pentaho system user.

poc@elf:~> sudo /usr/sbin/useradd -c "Pentaho BI Server" -s /bin/bash -mr pentaho
poc@elf:~> sudo passwd pentaho
root's password:
Changing password for pentaho.
New Password:
Reenter New Password:
Password changed.

On the Mac, do the following

vate:~ poc$ sudo dscl /Local/Default -create /Users/_pentaho RealName "PentahoCE BI Server" UserShell /bin/bash
vate:~ poc$ sudo sudo passwd _pentaho
Changing password for _pentaho.
New Password:
Reenter New Password:
Password changed.
vate:~ poc$

On Windows you'll want to set up your server software as service, after the installation.

If you haven't already done so, you'll want to download the software that you want to use from the appropriate place. In many cases this will be Sourceforge. Alternate sources might be the Enterprise Editions of Pentaho, the DynamoBI downloads for LucidDB, SQLstream, SpagoWorld, The R-Project, Hadoop, and many more possibilities.

Installing this software is no different than installing any other software on your particular operating system:

  • On any system you may need to unpack an archive indicated by a .zip, .rar, .gz or .tar file extension. On Windows & MacOSX you will likely just double-click the archive file to unpack it. On *Nix systems, including MacOSX and linux, you may also use the CLI and a command such as gunzip, unzip, or tar xvzf
  • On Windows, you'll likely double-click a .exe file and follow the instructions from the installer.
  • On MacOSX, you might double-click a .dmg file and drag the application into the Applications directory, or you'll do something more *Nix like.
  • On Linux systems, you might, at the CLI, execute the .bin file as the system user that you set up for this software.
  • On *Nix systems, you may wish to install the server-side somewhere other than a user-specific or local Applications directory, such as /usr/local/ or even in a web-root.

One thing to note is that most of the software that you'll use for an OSS DSS uses Java, and that the latest Pentaho includes the latest Java distribution. Most other software doesn't. Depending on your platform, and the supporting software that you have installed, you may wish to point [softwareNAME]_JAVA_HOME to the Pentaho Java installation, especially if the version of Java included with Pentaho meets the system requirements for other software that you want to use, and you don't have any other compatible Java on your system.

For both security, and a to avoid any confusion, you might want to change the ports used by the software you installed from their defaults.

You may need to change other configuration files from their defaults for various reasons as well, though I generally find the defaults to be satisfactory. You may need to install other software from one package into another package, for compatibility or interchange. For example, if you're trying out, or if you've purchased, Pentaho Enterprise Edition with Hadoop, Pentaho provides Java libraries [JAR files]and licenses to install on each Hadoop node, including code that Pentaho has contributed to the Hadoop project.

Also remember that Hadoop is a top-level Apache project, and not usable software in and of itself. It contains subprojects that make it useful:

  • Hadoop Commons containing the utilities that support all the rest
  • HDFS - the Hadoop Distributed File System
  • MapReduce - the software framework for distributed processing of data on clusters

You may also want one or more of the other Apache subprojects related to Hadoop:

  • Avro - a data serialization system
  • Chukwa - a data collection system
  • HBase - a distributed database management system for structured data
  • Hive - a data warehouse infrastructure
  • Mahout - a data mining library
  • Pig - an high-level data processing language for parallelization
  • Zookeeper - a coordination service for distributed applicaitons

Reading Pentaho Kettle Solutions

On a rainy day, there's nothing better than to be sitting by the stove, stirring a big kettle with a finely turned spoon. I might be cooking up a nice meal of Abruzzo Maccheroni alla Chitarra con Polpettine, but actually, I'm reading the ebook edition of Pentaho Kettle Solutions: Building Open Source ETL Solutions with Pentaho Data Integration on my iPhone.

Some of my notes made while reading Pentaho Kettle Solutinos:

…45% of all ETL is still done by hand-coded programs/scripts… made sense when… tools have 6-figure price tags… Actually, some extractions and many transformations can't be done natively in high-priced tools like Informatica and Ab Initio.

Jobs, transformations, steps and hops are the basic building blocks of KETTLE processes

It's great to see the Agile Manisto quoted at the beginning of the discussion of AgileBI. 

BayAreaUseR October Special Event

Zhou Yu organized a great special event for the San Francisco Bay Area Use R group, and has asked me to post the slide decks for download. Here they are:

No longer missing is the very interesting presentation by Yasemin Atalay showing the difference in plotting analysis using the Windermere Humic Aqueous Model for river water environmental factors, without using R and then the increased in variety and accuracy of analysis and plotting gained by using R.

Technology for the OSS DSS Study Guide

'Tis been longer than intended, but we finally have the technology, time and resources to continue with our Open Source Solutions Decision Support System Study Guide (OSS DSS SG).

First, I want to thank SQLstream for allowing us to use SQLstream as a part of our solution. As mentioned in our "First DSS Study Guide" post, we were hoping to add a real-time component to our DSS. SQLstream is not open source, and not readily available for download. It is however, a co-founder and core contributer to the open source Eigenbase Project, and has incorporated Eigenbase technology into its product. So, what is SQLstream? To quote their web site, "SQLstream enables executives to make strategic decisions based on current data, in flight, from multiple, diverse sources". And that is why we are so interested in having SQLstream as a part of our DSS technology stack: to have the capability to capture and manipulate data as it is being generated.

Today, there are two very important classes of technologies that should belong to any DSS: data warehousing (DW) and business intelligence (BI). What actually comprises these technologies is still a matter of debate. To me, they are quite interrelated and provide the following capabilities.

  • The means of getting data from one or more sources to one or more target storage & analysis systems. Regardless of the details for the source(s) and the target(s), the traditional means in data warehousing is Extract from the source(s), Transform for consistency & correctness, and Load into the target(s), that is, ETL. Other means, such as using data services within a services oriented architecture (SOA) either using provider-consumer contracts & Web Service Definition Language (WSDL) or representational state transfer (ReST) are also possible.
  • Active storage over the long term of historic and near-current data. Active storage as opposed to static storage, such as a tape archive. This storage should be optimized for reporting and analysis through both its logical and physical data models, and through the database architecture and technologies implemented. Today we're seeing an amazing surge of data storage and management innovation, with column-store relational database management systems (RDBMS), map-reduce (M-R), key-value stores (KVS) and more, especially hybrids of one or several of old and new technologies. The innovation is coming so thick and fast, that the terminology is even more confused than in the rest of the BI world. NoSQL has become a popular term for all non-RDBMS, and even some RDBMS like column-store. But even here, what once meant No Structured Query Language now is often defined as Not only Structured Query Language, as if SQL was the only way to create an RDBMS (can someone say Progress and its proprietary 4GL).
  • Tools for reporting including gathering the data, performing calculations, graphing, or perhaps more accurately, charting, formating and disseminating.
  • Online Analytical Processing (OLAP) also known as "slice and dice", generally allowing forms of multi-dimensional or pivot analysis. Simply put, there are three underlying concepts for OLAP: the cube (a.k.a. hypercube, multi-dimensional database [MDDB] or OLAP engine), the measures (facts) & dimensions, and aggregation. OLAP provides much more flexibility than reporting, though the two often work hand-in-hand, especially for ad-hoc reporting and analysis.
  • Data Mining, including machine learning and the ability to discover correlations among disparate data sets.

For our purposes, an important question is whether or not there are open source, or at least open source based, solutions for all of these capabilities. The answer is yes. As a matter of fact, there are three complete open source BI Suites [there were four, but the first, written in PERL, the Bee Project from the Czech Republic, is no longer being updated]. Here's a brief overview of SpagoBI, JasperSoft, and Pentaho.

Capability SpagoBI JasperSoft Pentaho
ETL Talend Talend
Reporting BIRT
Analyzer jPivot
OLAP Mondrian Mondrian Mondrian
Data Mining Weka None Weka

We'll be using Pentaho, but you can use any of the these, or any combination of the OSS projects that are used by these BI Suites, or pick and choose from the more than 60 projects in our OSS Linkblog, as shown in the sidebar to this blog. All of the OSS BI Suites have many more features than shown in the simple table above. For example, SpagoBI has good tools for geographic & location services. Also, JasperSoft Professional and Enterprise Editions have many features than their Community Edition, such as Ad Hoc Reporting and Dashboards. Pentaho has a different Analyzer in their Enterprise Edition than either jPivot or PAT, Pentaho Analyzer, based upon the SaaS ClearView from the now-defunct LucidEra, as well as ease-of-use tools such as an OLAP schæma designer, and enterprise class security and administration tools.

Data warehousing using general purpose RDBMS systems such as Oracle, EnterpriseDB, PostrgeSQL or MySQL, are gradually giving way to analytic database management system (ADBMS), or, as we mentioned above, the catch-all NoSQL data storage systems, or even hybrid systems. For example, Oracle recently introduced hybrid column-row store features, and Aster Data has a column-store Massive Parallel Processing (MPP) DBMS|map-reduce hybrid [updated 20100616 per comment from Seth Grimes]. Pentaho supports Hadoop, as well as traditional general purpose RDBMS and column-store ADMBS. In the open source world, there are two columnar storage engines for MySQL, Infobright and Calpont InfiniDB, as well as one column-store ADBMS purpose built for BI, LucidDB. We'll be using LucidDB, and just for fun, may throw some data into Hadoop.

In addition, a modern DSS needs two more primary capabilities. Predictives, sometimes called predictive intelligence or predictive analytics (PA), which is the ability to go beyond inference and trend analysis, assigning a probability, with associated confidence, or likelihood of an event occurring in the future, and full Statistical Analysis, which includes determining the probability density or distribution function that best describes the data. Of course, there are OSS projects for these as well, such as The R Project, the Apache Common Math libraries, and other GNU projects that can be found in our Linkblog.

For statistical analysis and predictives, we'll be using the open source R statistical language and the open standard predictive model markup language (PMML), both of which are also supported by Pentaho.

We have all of these OSS projects installed on a Red Hat Enterprise Linux machine. The trick will be to get them all working together. The magic will be in modeling and analyzing the data to support good decisions. There are several areas of decision making that we're considering as examples. One is fairly prosaic, one is very interesting and far-reaching, and the others are somewhat in between.

  1. A fairly simple example would be to take our blog statistics, a real-time stream using SQLstream's Twitter API, and run experiments to determine whether or not, and possibly how, Twitter affects traffic to and interaction with our blogs. Possibly, we could get to the point where we can predict how our use of Twitter will affect our blog.
  2. A much more far-reaching idea was presented by Ken Winnick to me, via Twitter, and has created an on-going Twitter conversation and hashtag, #BPgulfDB. Let's take crowd sourced, government, and other publicly available data about the recent oilspill in the Gulf of Mexico, and analyze it.
  3. Another idea is to take historical home utility usage plus current smart meter usage data, and create a real-time dashboard, and even predictives, for reducing and managing energy usage.
  4. We also have the opportunity of using public data to enhance reporting and analytics for small, rural and research hospitals.

OSS DSS Formalization

The next step in our open source solutions (OSS) for decision support systems (DSS) study guide (SG), according to the syllabus, is to make our first decision: a formal definition of "Decision Support System". Next, and soon, will be a post listing the technologies that will contribute to our studies.

The first stop in looking for a definition of anything today, is Wikipedia. And indeed, Wikipedia does have a nice article on DSS. One of the things that I find most informative about Wikipedia articles, is the "Talk" page for an article. The DSS discussion is rather mild though, no ongoing debate as can be found on some other talk pages, such as the discussion about Business Intelligence. The talk pages also change more often, and provide insight into the thoughts that go into the main article.

And of course, the second stop is a Google search for Decision Support System; a search on DSS is not nearly as fruitful for our purposes. :)

Once upon a time, we might have gone to a library and thumbed through the card catalog to find some books on Decision Support Systems. A more popular approach today would be to search Amazon for Decision Support books. There are several books in my library that you might find interesting for different reasons:

  1. Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL by Roland Bouman & Jos van Dongen provides a very good overview of data warehousing, business intelligence and data mining, all key components to a DSS, and does so within the context of the open source Pentaho suite
  2. Smart Enough Systems: How to Deliver Competitive Advantage by Automating Hidden Decisions by James Taylor & Neil Raden introduces business concepts for truly managing information and using decision support systems, as well as being a primer on data warehousing and business intelligence, but goes beyond this by automating the data flow and decision making processes
  3. Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications by Larissa T. Moss & Shaku Atre takes a business, program and project management approach to implementing DSS within a company, introducing fundamental concepts in a clear, though simplistic level
  4. Competing on Analytics: The New Science of Winning by Thomas H. Davenport & Jeanne G. Harris in many ways goes into the next generation of decision support by showing how data, statistical and quantitative analysis within a context specific processes, gives businesses a strong lead over their competition, albeit, it does so at a very simplistic, formulaic level

These books range from being technology focused to being general business books, but they all provide insight into how various components of DSS fit into a business, and different approaches to implementing them. None of them actually provide a complete DSS, and only the first focuses on OSS. If you followed the Amazon search link given previously, you might also have noticed that there are books that show Excel as a DSS, and there is a preponderance of books that focus on the biomedical/pharmaceutical/healthcare industry. Another focus area is in using geographic information systems (actually one of the first uses for multi-dimensional databases) for decision support. There are several books in this search that look good, but haven't made it into my library as yet. I would love to hear your recommendations (perhaps in the comments).

From all of this, and our experiences in implementing various DW, BI and DSS programs, I'm going to give a definition of DSS. From a previous post in this DSS SG, we have the following:

A DSS is a set of processes and technology that help an individual to make a better decision than they could without the DSS.
-- Questions and Commonality

As we stated, this is vague and generic. Now that we've done some reading, let's see if we can do better.

A DSS assists an individual in reaching the best possible conclusion, resolution or course of action in stand-alone, iterative or interdependent situations, by using historical and current structured and unstructured data, collaboration with colleagues, and personal knowledge to predict the outcome or infer the consequences.

I like that definition, but your comments will help to refine it.

Note that we make no mention of specific processes, nor any technology whatsoever. It reflects my bias that decisions are made by individuals not groups (electoral systems not withstanding). To be true to our "TeleInterActive Lifestyle" &#59;) I should point out that the DSS must be available when and where the individual needs to make the decision.

Any comments?

Pentaho Reporting Review

As promised in my post, "Pentaho Reporting 3.5 for Java Developers First Look", I've taken the time to thoroughly grok Pentaho Reporting 3.5 for Java Developers by Will Gorman [direct link to Packt Publishing][Buy the book from Amazon]. I've read the book, cover-to-cover, and gone through the [non-Java] exercises. As I said in my first look at this book, it contains nuggets of wisdom and practicalities drawn from deep insider knowledge. This book does best serve its target audience, Java developers with a need to incorporate reporting into their applications. But it is also useful for report developers who wish to know more about Pentaho, and Pentaho users who wish to make their use of Pentaho easier and the resulting reporting experience richer.

The first three chapters provide a very good introduction to Pentaho Reporting and its relationship to the Pentaho BI Suite and the company Pentaho, historical, technical and practical. These three chapters are also the ones that have clearly marked sections for Java specific information and exercises. By the end of Chapter Three, you'll have installed Pentaho Report Designer, and built several rich reports. If you're a Java developer, you'll have had the opportunity to incorporate these reports into both Tomcat J2EE or Swing web applications. You'll have been introduced to the rich reporting capabilities of Pentaho, accessing data sources, the underlying Java libraries, and the various output options that include PDF, Excel, CSV, RTF, XML and plain text.

Chapters 4 through 8 is all about the WYSIWYG Pentaho Report Designer, the pixel-level control that it gives you over the layout of your reports, and the many wonderful capabilities provided by Pentaho Reporting from a wide range of chart types to embedding numeric and text functions, to cross-tabs and sub-reports. Other than Chapter 5, these chapters are as useful for a business user creating their own reports, as it is for a report developer. Chapter 5 is a very deep dive, very technical look at incorporating various data sources. The two areas that really stand out are the charts (Chapter 6) and functions (Chapter 7).

There are a baker's dozen types of charts covered, with an example for each type. Some of the more exotic are Waterfall, Bar-Line, Radar and Extended XY Series charts.

There are hundreds of parameters, functions and expressions that can be used in Pentaho Reports, and Will covers them all. The formula capability of Pentaho Reporting follows the OpenFormula standard, similar to the support for formulæ in Microsoft Excel, and the same as that followed by One can provide computed text or numeric values within Pentaho reports to a fairly complex extent. Chapter 7 provides a great introduction to using this feature.

Chapters 9 through 11 are very much for the software developer, covering the development of Interactive Reports in Swing and HTML, the use of Pentaho's APIs and extension of Pentaho Reporting capabilities. It's all interesting stuff, that really explains the technology of Pentaho Reporting, but there's little here that is of use to the business user or non-Java report developer.

The first part of Chapter 12, on the other hand, is of little use to the Java developer, as it shows how to take reports created in Pentaho Report Designer and publish them through the Pentaho BI-Server, including formats suitable to mobile devices, such as the iPhone. The latter part of Chapter 12 goes into the use of metadata, and is useful both for the report developer and the Java developer.

So, as I said in my first look, the majority of the book is useful even if you're not a Java developer who needs to incorporate sophisticated reports into your application. That being said, Will Gorman does an excellent job in explaining Pentaho Reporting, and making it very useful for business users, report designers, report developers and, his target audience, Java developers. I heartily recommend that you buy this book. [Amazon link]

Pentaho Reporting 3.5 for Java Developers First Look

I was approached by Richard Dias of Packt Publishing to review "Pentaho Reporting 3.5 for Java Developers" written by Will Gorman. (Link is to

Richard Dias has indicated you are a Friend:

Hi Joseph,

My name is Richard Dias and I work for Packt Publishing which specializes in publishing focused IT related books.

I was wondering if you would be interesteed in reviewing the book "Pentaho Reporting for Java Developers" written by Will Gorman.

- Richard Dias

After some back and forth, I decided to accept the book in exchange for my review.

Hi Joseph,

Thanks for the reply and interest in reviewing the book. I have just placed an order for a copy of the book and it should arrive at your place within 10 days. Please do let me know when you receive it.

I have also created a unique link for you. It is Please feel free to use this link in your book review.

In the meanwhile, if you could mention about the book on your blog and tweet about the book, it would be highly appreciated. Please do let me know if it is fine with you.

I’m also sending you the link of an extracted chapter from the book (Chapter 6 Including Charts and Graphics in Reports). It would be great if you could put up the link on your blog. This would act as first hand information for your readers and they will also be able to download the file.

Any queries or suggestions are always welcome.

I look forward to your reply.

Best Regards,


Richard Dias
Marketing Research Executive | Packt Publishing |

Shortly thereafter, I received notification that the book had shipped. It arrived within two weeks.

Of course, I've been too busy to do more than skim through the book. Anyone who follows me as JAdP on Twitter knows that in the past few weeks, I've been:

  • helping customers with algorithm development and implementing Pentaho on LucidDB,
  • working with Nicholas Goodman with his planning for commercial support of LucidDB through Dynamo Business Intelligence, and roadmaps for DynamoDB packages built on LucidDB's plugin architecture, and
  • migrating our RHEL host at ServerBeach from our old machine to a new one, while dealing with issues brought about by ServerBeach migrating to Peer1's tools.

None of which has left any time for a thorough review of "Pentaho Reporting for Java Developers".

I hope to have a full review up shortly after the holidays, which for me runs from Solstice to Epiphany, and maybe into the following weekend.

First, a little background. Will Gorman, the author, works for Pentaho, in software engineering, as a team lead, and works primarily on Pentaho Reporting products, a combination of server-side (Pentaho BI-Server), Desktop (MacOSX, Linux and Windows platforms) and Web-based software (Reporting Engine, Report Designer, Report Design Wizard and Pentaho Ad Hoc Reporting), which stems from the open source JFreeReport and JFreeChart. While I don't know Will personally, I do know quite a few individuals at Pentaho, and in the Pentaho community. I very much endorse their philosophy towards open source, and the way they've treated the open source projects and communities that they've integrated into their Pentaho Business Intelligence Suite. I do follow Will on Twitter, and on the IRC Freednode Channel, ##pentaho.

I myself am not a Java Developer, so at first I was not attracted to a book with a title that seemed geared to Pentaho Developers. Having skimmed through the book, I think that the title was poorly chosen. (Sorry Richard). I find that I can read through the book without stumbling, and that there is plenty of good intelligence that will help me better server and instruct my customers through the use of Pentaho Report Designer.

My initial impressions are good. The content seems full of golden nuggets of "how-tos" and background information not commonly known among the Pentaho community. Will's knowledge of Pentaho Reporting and how it fits into the rest of the Pentaho tools, such as KETTLE (Pentaho Data Integration) and Mondrian (Pentaho Analysis), along with a clear writing style makes all aspects of Pentaho more accessible to the BI practitioner, as well as those that wish to embed Pentaho Reporting into their own application.

This book is not just for Java developers, but for anyone who wishes to extend their abilities in BI, Reporting and Analysis, with Pentaho as an excellent example.

I'll be following up with the really exciting finds as I wend my way through Will's gold mine of knowledge, and, will do my best to fulfill my promise of a full review by mid-January.

You can also click through the Chapter 6 (a PDF) as mentioned in Richard's email.

Thank you, Richard. And most especially, thank you, Will.

Hibernate Tools for Eclipse 3.1

I'm currently on a gig at a very interesting SaaS company. We're introducing and creating agile methods, creating a new SOA with MDM and recreating the applications in the new architecture. One snag that we hit is that the company is using WebLogic 9.2 and the most recent Hibernate Tools won't work with the WebLogic WorkShop, which is based on Eclipse 3.1.

Can anyone point me to an archive where I can find Hibernate Tools for Eclipse 3.1? The team can't find it anywhere. What's up Hibernate? Use the latest or forget it? XX(

On a related note, I want to give Kudos and a huge hoozah to Martin Ying, Principal Consultant with BEA Systems, Inc. He's accomplished an incredible amount in the four days he's been here. I heartily recommend Martin to anyone wanting to get started with developing in WebLogic Workshop. He's amazing. Thank you, Martin.

Cotinuous Process and Code Improvement

We're constantly recreating our 6D™ project management methodology. It started with combining Clarise's software development and project management experience with my aerospace system engineering and program management experience to adopt strict project controls to modern business needs for responsive software development and system integration processes working through distributed personnel. Well, here's a quick thought... software development and deployment should move away from traditional release cycle concepts to one of continuous process/code improvement within SaaS and virtual appliance environments. No code is alpha nor beta nor production, but a continuum of changes and adaptations responding to fluctuating business needs; done within a well managed environment to prevent security errors, poor performance, "garbage out" and junk code. So as we're assuring that our 6D™ [six dimensions of a project] is in accord with the PMBoK, we'll be keeping this thought in mind as well, and let's think beyond Extreme and Agile programming and continuous process improvement for software quality.

Pentaho SQR for Bugzilla

Today, Pentaho launched their new open source project, Software Quality Reports (SQR) for Bugzilla. Bugzilla is a "Defect Tracking System" or "Bug-Tracking System" that allows developers to keep track of outstanding bugs in their product. We interviewed Lance Walter and Nicholas Goodman of Pentaho for this podcast to get the inside story about this new project.

Software Quality Reports for Bugzilla is the first in a series of focused solutions that Pentaho will be bringing forth in 2006 onward. SQR for Bugzilla is based upon the Pentaho Open BI Suite to provide enhanced reporting and analysis of data from Bugzilla.

Lance pointed out that one reason the first Pentaho solution was for Bugzilla is that Bugzilla has quickly achieved worldwide popularity for its rich functionality, and is used by organizations ranging from open source leaders like The Apache Project, Novell, Open Office, and Red Hat to public- and private-sector organizations including NASA, AT&T, Citigroup, GlaxoSmithKline, France Telecom, Rutgers University,, Siemens and more.

Nick provides a wealth of details on the uses and goals of the SQR for Bugzilla project. Bugzilla is used as a source system for the Pentaho solution, though other source systems, such as configuration management and version control, can be added. The Pentaho SQR for Bugzilla adds reports and analytical slice & dice which allow project managers, end users and developers to answer questions about bug resolution or bug burn that can not be easily answered using only Bugzilla reports. Pentaho SQR for Bugzilla is a separate project, licensed under the open source Mozilla license 1.1, and can be downloaded from Pentaho or Sourceforge.

Screen shots can be found at Sourceforge.

The Pentaho SQR for Bugzilla podcast
is approximately 40 minutes in length and 37 MB in size.

Update: You can read the full press release from Pentaho, New Open Source Project Harnesses World’s Most Popular Open Source BI Suite to Enhance Bugzilla with Reporting and Analysis.

April 2018
Mon Tue Wed Thu Fri Sat Sun
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
 << <   > >>

At the beginning, The Open Source Solutions Blog was a companion to the Open Source Solutions for Business Intelligence Research Project, and book. But back in 2005, we couldn't find a publisher. As Apache Hadoop and its family of open source projects proliferated, and in many ways, took over the OSS data management and analytics world, our interests became more focused on streaming data management and analytics for IoT, the architecture for people, processes and technology required to bring value from the IoT through Sensor Analytics Ecosystems, and the maturity model organizations will need to follow to achieve SAEIoT success. OSS is very important in this world too, for DMA, API and community development.

37.652951177164 -122.490877706959


  XML Feeds


Our current thinking on sensor analytics ecosystems (SAE) bringing together critical solution spaces best addressed by Internet of Things (IoT) and advances in Data Management and Analytics (DMA) is here.

Recent Posts

powered by b2evolution free blog software