OSS Public:Research Shelf/OSBI Daily Archive




This page lists all the past entries that have appeared in the OSBI Daily (http://www.squidoo.com/osbi/#module1316436). A daily tip, definition, quote, concept or other information on DW, BI or Open Source is provided in OSBI Daily through the OSBI Squidoo lens.




July 2007

2007.07.04
Happy Independence Day in the USA, and we hope you're enjoying the 4th day of July wherever you are. We hope that you have clear skies, and great fireworks. For the first time that I can remember on the Coast, we have clear skies today. Usually, we just watch the fog turn pink, the fog turn green, and the fog turn bright, and then the smoke from the fireworks mingles with the fog and all goes grey.  ;-)


2007.07.01
Pentaho has announced that the Université de Montréal has picked Pentaho after an extensive bake-off among many other offerings including Business Objects (already owned it), Oracle BI (already owned it), and SAS (considered it); Pentaho also announced that they've replaced Crystal Reports at Boyne Resorts.

June 2007

2007.06.30
The server has been restored.

2007.06.25
The server is down.

May 2007


2007.05.09
SpagoBI 2.2.0 with some spanking new features (http://forge.objectweb.org/project/shownotes.php?release_id=1934) is available for download (http://forge.objectweb.org/project/showfiles.php?group_id=195&release_id=1934). Mule 1.4 has been released with new BPM (http://mule.codehaus.org/display/MULE/Release+Notes) support features. JasperSoft wins the 2007 Duke's Choice Award at JavaOne (http://www.jasperforge.org/index.php?option=com_content&task=view&id=312&Itemid=2) Developer Conference, with James Gosling naming the JasperReports Open Source project among the coolest products in the world for Java technology.

2007.05.05-08
We're still being a bit lazy and not updating the OSBI Daily as we should.

2007.05.04
The Open Source ThinkTank 2007 (http://thinktank.olliancegroup.com/) was held in March by the Olliance Group and DLA Piper, who have available for immediate download a PDF of their 14-page Executive Summary Report (http://thinktank.olliancegroup.com/ostt2007report.pdf). The Executive Summary is very informative, showing increased acceptance of Open Source Solutions, continued barriers to adoption, and impact upon proprietary software vendors.

2007.04.27-05.03
Just didn't find anything that was all that interesting, and didn't feel like making something up.

April 2007


2007.04.24
Here are the feeds you need to find the goodies from MySQL2007: PlanetMySQL Feed (http://www.planetmysql.org/rss20.xml), MySQL from Technorati (http://feeds.technorati.com/feed/posts/tag/MySQL+Conference), MySQL on IceRocket (http://www.icerocket.com/search?tab=blog&q=MySQL&rss=1), and MySQL on Google (http://blogsearch.google.com/blogsearch_feeds?hl=en&q=MySQL+Conference&ie=utf-8&num=10&output=rss).

2007.04.23
We've put together what we hope is a comprehensive guide to open source BI and data warehousing at the 2007 MySQL (http://press.teleinteractive.net/oss/2007/04/23/bi_and_dw_at_mysql_2007) Conference and Exposition.

2007.04.21-22
Rain on Saturday but sunny on Sunday, as it should be  ;-)

2007.04.20
JasperSoft will have a booth and be presenting on three topics (http://press.teleinteractive.net/oss/2007/04/21/jaspersoft_at_mysql2007) at the MySQL Conference & Expo 2007.

2007.04.19
An integrated reporting solution is coming to OpenOffice.org as reported (http://www.pentaho.org/news/releases/20070419_pentaho_adds_reporting_solution_to_openoffice.php) by Pentaho and Sun Microsystems today. The jFreeReport open source report project was brought into Pentaho in late 2005, and has been enhanced with "drag-and-drop report designer, MDX support, and integration with the Pentaho platform for report scheduling, security, and portal integration[;] Pentaho's recently-announced standards-based metadata layer and AJAX-based ad hoc query tool will also support and extend Pentaho Reporting's capabilities" allowing OpenOffice.org users "to create reports with content from the OpenOffice.org Base database as well as a wide range of proprietary and open source relational databases, OLAP and XML sources".

2007.04.18
Mondrian 2.3 is available (http://sourceforge.net/project/showfiles.php?group_id=35302&package_id=27786&release_id=496204) for download.

2007.04.17
JasperSoft has announced (http://www.jaspersoft.com/nw_press_jaspersoft_30000_jasperforge.html) that JasperForge.org (http://jasperforge.org/) has passed 30,000 registered members, with over 130 projects that "range from developer tools to integration toolkits to localization powered by an interactive online translation tool, JasperBabylon, where more than 30 different translation projects are underway. Some of the more popular JasperForge.org projects include the Drupal Bridge for JasperReports, iReport with Hibernate, Oracle Stored Procedures, Reporting and OLAP Analysis Tools for Bugzilla, and Jasper4Eclipse, an iReport plug-in project for eclipse".

2007.04.16
The Web Services Business Process Execution Language (WS-BPEL (http://www.oasis-open.org/committees/wsbpel/)) version 2.0 has been approved as an OASIS Standard (http://www.oasis-open.org/news/oasis-news-2007-04-12.php), signifying the highest level of ratification by the Organization for the Advancement of Structured Information Standards (OASIS (http://www.oasis-open.org/)).

2007.04.13
I (Joseph) spent the afternoon yesterday at the JasperSoft HQ (http://press.teleinteractive.net/oss/2007/04/13/an_afternoon_at_jaspersoft).

2007.04.03-12
No OSBI Daily these days. We've been hard at work updating our information and researching the OSBI projects that we've been tracking as well as looking for new ones. We'll be updating the lens over the next few weeks.

2007.04.03
Matt Casters, founder of the open source KETTLE ETL/EAI project, will be presenting at the MySQL 2007 Conference; he has more detail and another invitation to the Pentaho BoaF (http://www.ibridge.be/?p=39) on his blog.

2007.04.02
Talend has announced the opening of their U.S.A. Headquarters, as well as several new executive hires (http://www.talend.com/press/opening-us.htm) in Palo Alto, CA. Clarise's Top 10 "BI on a Budget," (http://www.campustechnology.com/articles/46571_1/) wraps up Campus Technology, 4/1/2007 issue.

2007.03.31-04.01
It was foggy on the Coast this weekend, inspiring me to make lasagna. I hope all had great April's Foolery.


March 2007


2007.03.30
MuleCon2007 was a great success. Mule has a summary page up (http://mule.codehaus.org/display/MULE/MuleCon+2007,+San+Francisco) with links to our blogging (http://press.teleinteractive.net/oss?s=MuleCon2007&sentence=AND&submit=Search), the MuleSource blog recaps (http://blogs.mulesource.com/), and PDFs of all the presentations and training available for download. Whether you were there or not, this is a great resource for learning about Mule (http://mule.codehaus.org/display/MULE/MuleCon+2007,+San+Francisco) and possibilities opened up with ESB.

2007.03.29
Nicholas Goodman will be conducting Operational BI Training with Pentaho in the San Francisco Bay Area, April 23-26, covering module 1, "Introduction to Design Studio," and module 2, "Pentaho Report Development and Deployment." Register via his blog, and state so in the comments (http://www.nicholasgoodman.com/bt/blog/2007/03/26/meet-me-in-san-francisco-pentaho-training/), to get some extra Pentaho schwag.

2007.03.28
Julian Hyde, founder of Mondrian OLAP, will be speaking at the 2007 MySQL user conference, and hosting a Mondrian BoaF. He's asking (http://julianhyde.blogspot.com/2007/03/mondrian-and-me-at-mysql.html) for topics of interest.

2007.03.27-28
MuleSource is hosting a free [as in beer] two-day MuleCon (http://mule.mulesource.org/wiki/display/MULE/MuleCon+2007%2C+San+Francisco) in San Francisco on March 27 & 28; day 1 focuses on introductory and mid-level training while day 2 emphasizes Mule 2.0 roadmap updates to the community. We'll be blogging the Mule training (http://press.teleinteractive.net/oss?s=MuleCon2007&sentence=AND&submit=Search) from there.

2007.03.26
The Apache wiki has two links to articles regarding the relationship between Service Component Architecture and Java Business Integration (http://cwiki.apache.org/confluence/pages/viewblogposts.action?key=SM&postingDate=2007%2F3%2F22&period=5) for your SOA ESB. phpPgAdmin v4.1.1 is available (http://phppgadmin.sourceforge.net/) for download.

2007.03.25
The Campus Technology 2007 schedule (http://campustechnology.com/conference/index.aspx) is online; we have more information on OSS and BI related sessions (http://press.teleinteractive.net/oss/2007/03/25/campus_technology_2007_schedule) on our blog.

2007.03.24
A debate that's been going on for a few years was raised again, with an OSBI slant, by Matt Asay in his post "Does first winner take all in open source? (http://asay.blogspot.com/2007/03/does-first-winner-take-all-in-open.html)".

2007.03.23
Check out the Open Source Community (http://opensourcecommunity.org/); it's only been established recently as "a place for those of us interested in open source solutions and community issues to come and talk" - so go there and join in. We're attending the OSCMS2007 Summit at Yahoo! yesterday and today, and blogging (http://press.teleinteractive.net/tia_life?s=OSCMS2007&sentence=AND&submit=Search) it.

2007.03.22
The editors of Dr. Dobb's Journal announced (http://www.prnewswire.com/cgi-bin/stories.pl?ACCT=104&STORY=/www/story/03-22-2007/0004551328&EDATE=) the winners of the 17th annual Jolt Product Excellence and Productivity Awards. Pentaho is one of the Productivity Award winners in the Enterprise Tools category. We're attending the OSCMS2007 Summit at Yahoo! today and tomorrow, and blogging (http://press.teleinteractive.net/tia_life?s=OSCMS2007&sentence=AND&submit=Search) it.

2007.03.21
MuleSource (http://www.mulesource.com) has added three key new hires (http://www.mulesource.com/company/press_releases/Newhires_032107.php) with deep SOA and ESB backgrounds: Mike Lewis (VP of Sales), Daniel Brum (Technical Director) and Damian Raffell (VP of EMEA Sales and Operations). Pentaho (http://www.pentaho.org/)'s recent sales announcements, to iStockphoto (http://www.pentaho.org/news/releases/20070313_istockphoto_deploys_pentaho_for_reporting_and_data_integration.php) and ZipRealty (http://www.pentaho.org/news/releases/20070321_ziprealty_deploys_pentaho_for_etl.php), show how OSBI components are being used to supplement and enhance existing data warehousing implementations based on proprietary and home-grown systems.

2007.03.20
TheServerSide Java Symposium (http://javasymposium.techtarget.com/lasvegas/index.html) starts tomorrow in Las Vegas, NV, USA. The next version of the open source license is dividing the community, and it isn't even out yet, says Charles Babcock in his March 17, 2007 posting for InformationWeek (from the March 19, 2007 issue), "The Controversy over GPLv3" (http://www.informationweek.com/story/showArticle.jhtml?articleID=198001444). Ian Skerrett has some EclipseCon 2007 Demographics (http://ianskerrett.wordpress.com/2007/03/16/eclipsecon-2007-demographics/).

2007.03.19
Steven J. Vaughan-Nichols discusses Microsoft's announced plans to release the source for the FoxPro (http://www.eweek.com/article2/0,1759,2105307,00.asp?kc=EWRSS03129TX1K0000616) database & programming language.

2007.03.17-18
Happy St. Patrick's Day weekend for March 17 & 18. Enjoy my Corned Beef & Cabbage (http://press.teleinteractive.net/cynasuralog/2007/03/17/corned_beef_and_cabbage_buono_sanctus_pa) and even leftovers (http://press.teleinteractive.net/cynasuralog/2007/03/18/leftover_corned_beef_and_cabbage_recipes). Éirinn go Brách.

2007.03.16
No Daily for March 15th as the Squidoo editor just blanked out on us. But here are some things that caught our eyes yesterday and today. The Open Source Software in Education Series (http://www.wikieducator.org/Open_Source_Software_in_Education_Series_on_Terra_Incognita) on the Terra Incognita wiki lists upcoming articles. Some bloggers who attended the Olliance Open Source Think Tank were posting their thoughts: Gianugo Rabellino (http://www.rabellino.it/blog/2007/03/15/thoughts-from-the-tank/), Fabrizio Capobianco (http://www.funambol.com/blog/capo/2007/03/what-difference-year-makes.html) and Don Klaiss (http://blog.compiere.com/2007/03/will_compiere_r.html).

2007.03.14
BIRT has made PDFs of their EclipseCon2007 presentations available (http://www.eclipse.org/birt/phoenix/presos/) for download. Today is your last chance to save with early bird registration (http://mysqlconf.com/) for the MySQL Conference.

2007.03.13
EnterpriseDB announced the availability of their new RemoteDBA (http://www.enterprisedb.com/news_events/press_releases/13_03_07.do) service for EnterpriseDB and PostgreSQL; the RemoteDBA service includes (http://www.enterprisedb.com/support/remotedba.do), among other things, application evaluation, migration, deployment and monitoring. SpagoBI will be presenting (http://spagobi.eng.it/ecm/faces/public/guest/home/highlighted?portal:componentId=highlighted-detail&portal:type=action&portal:isSecure=false&uicomponent=RemoteDetail&op=viewDocument&objectId=production:/cms/publications/news-events/events/EngineeringAndSpagoBIatPAAL2007) at PAAL 2007 (http://www.paal2007.it/) (Open and Free Public Administration - Opportunity, Critical Aspects and Experiences in the Application of Free Software and Open Standards in the Italian Public Administration) at Pula (Sardinia, Italy), March 15-16, 2007.

2007.03.12
MySQL GUI Tools Bundle for 5.0 is available for download  (http://dev.mysql.com/downloads/gui-tools/5.0.html) under the MySQL AB "dual licensing" model. And speaking of MySQL, there are only a few days left for early-bird discounting on registration  (http://www.mysqlconf.com/pub/w/54/register.html) for the 2007 MySQL Conference & Expo. BIRT 2.2 Milestone 5 has been released  (http://www.eclipse.org/birt/phoenix/project/notable2.2M5.php) to the public. We missed it back in February, but Talend has a partnership  (http://spagobi.eng.it/ecm/faces/public/guest/home/press-release?portal:componentId=press-release-detail&portal:type=action&portal:isSecure=false&uicomponent=RemoteDetail&op=viewDocument&objectId=production:/cms/publications/news-events/press-release/TalendOpenStudioandSpagoBIPartnership) with SpagoBI bringing ETL to the SpagoBI suite, as Talend has done with JasperSoft.

2007.03.09
Our Internet connectivity was down from early this week until today, and with it went our ability to update the OSBI Daily. So, we had a week off, and we hope that you'll rejoin us on Monday, March 12.

2007.03.03-04
Happy weekend. Saturday was incredibly sunny and warm Coastside. Sunday had a few clouds, and not quite as warm, but was a great day nonetheless.

2007.03.02
Firebird 2.0.1 Release Candidate 2 kits (http://www.firebirdsql.org/index.php?op=files&id=fb201_rc2) for Windows and Linux are available for testing. MuleSource is hosting a free [as in beer] two-day MuleCon (http://mule.mulesource.org/wiki/display/MULE/MuleCon+2007%2C+San+Francisco) in San Francisco on March 27 & 28; day 1 focuses on training while day 2 emphasizes Mule 2.0 community updates.

2007.03.01
Mark Madsen tells us (http://clickstream.blogspot.com/2007/02/full-day-conference-session-on-open.html) that The Data Warehousing Institute is planning a full day conference session for May. Open Source and commercial SaaS are one step closer to integration today with MuleSource announcing a connector (http://www.mulesource.com/company/press_releases/Salesforce_022807.php) for the Mule ESB into the online CRM/SFA salesforce.com. And while not related to open source, Oracle has announced that it has agreed to buy Hyperion Solutions (http://www.oracle.com/corporate/press/2007_mar/hyperion.html) Corporation (NASDAQ: HYSL) through a cash tender offer for $52.00 per share, or approximately $3.3 billion.


February 2007


2007.02.28
In collaboration with rPath, Ingres has announced the general availability of Icebreaker (http://www.ingres.com/press/2007-02-27_Icebreaker.php), a virtualized software appliance RDBMS making the operating system transparent. This validates our opinion of the true value of virtualization (http://press.teleinteractive.net/cynasuralog/2006/12/15/three_months_with_a_mac#virtualization), beyond running Windows on MacOSX or Linux, or Linux on Windows or MacOSX. In other DW Appliance news, see "Data warehousing: the appliance of science (http://www.cbronline.com/article_cbr.asp?guid=9104551D-56C1-4EE7-BDF9-BD219E8685BF)" 27th February 2007 by Madan Sheina. It's an interesting read, though we don't necessarily agree with everything in it, such as the view that the start-ups' open source-based DW Appliances are only suited for data marts, given that some scale to petabyte levels of storage.

2007.02.27
If you're a fan of the PALO (http://www.palo.net/) open source MOLAP for spreadsheets solution, you'll be happy to know that the final release of version 1.5 is available (http://www.palo.net/index.php?show=66bec900afd48a1ea44b4c2ebfda490e), as well as a revised roadmap (http://www.palo.net/index.php?show=2ba716b69dabc5f1e38eda6c22ba6a04) for version 2. In other news, JasperSoft and MySQL have announced the availability of Jasper for MySQL: OEM Edition (http://www.jaspersoft.com/nw_press_jaspersoft_jasper_for_mysql.html), a suite of operational reporting products that includes a high-performance interactive report server, a graphical report creation tool, and a pixel-perfect reporting system with dashboards, tables, crosstabs and charts. And speaking of MySQL, Dave Dargo has the most cogent view on MySQL as a "real" RDBMS (http://davedargo.blogspot.com/2007/02/mysql-real-database-or-not.html) that we've yet seen.

2007.02.26
BI To Build or Buy or Open Source (http://press.teleinteractive.net/oss/2007/02/26/bi_to_build_or_buy_or_open_source) is a post we wrote today in our Open Source Solutions blog that will help us prepare, through your comments, for our upcoming presentation at Campus Technology 2007, and hopefully help our readers with an outline of the strategic options available today in creating a BI solution.

2007.02.24-25
We hope that you had a great weekend. Here on the San Mateo County Coastside we had some light rain, and a wee bit of sun on Saturday, with heavy rain, and even hail, on Sunday.

2007.02.23
DATAllegro (http://www.datallegro.com/) made a series of announcements on 2007 February 20th, including the release of v3 (http://www.datallegro.com/pr/2_20_07_datallegro_v3.asp) of their data warehouse appliance, strengthening of the relationship with Ingres (http://www.datallegro.com/pr/2_20_07_datallegro_ingres.asp), use of EMC's CLARiiON (http://www.datallegro.com/pr/2_20_07_datallegro_emc.asp), and use of Cisco's InfiniBand Switches (http://www.datallegro.com/pr/2_20_07_datallegro_cisco.asp).

2007.02.22
We've been away for a few days, so here's a list of links to make up for our lackadaisical ways.

  • Mulesource (http://www.mulesource.com/) announces GetMule (http://www.getmule.com/), a minisite with convenient links for downloading the Mule ESB (http://www.mulesource.com/download/), getting the RedMonk white paper (http://www.getmule.com/reportform.php) on ESB, completing a survey (http://www.surveymonkey.com/s.asp?u=625633298361) to get a T-shirt, and watching Ross on Flash (http://www.getmule.com/mule_overview_flash1.php)
  • Pentaho on Web2.0-roids (http://www.pentaho.org/news/releases/20070222_pentaho_announces_general_availability_of_pentaho_ajax.php)
  • Pentaho at MySQL (http://www.pentaho.org/news/releases/20070220_pentaho_headlines_data_warehousing_and_bi_topics_at_mysql_2007_user_conference.php)
  • Get a PDF (http://www.sun.com/emrkt/srsc/postgresql/c_06-1081_SunOmniTl_SB-Fhr.pdf) from Sun Microsystems on a PostgreSQL DW success story
  • Julian writes (http://julianhyde.blogspot.com/2007/02/mondrian-at-fisl-8th-international-free.html) about Mauro Pichiliani's upcoming Mondrian talk at the 8th International Free Software Forum (FISL)

Enjoy the news, and thanks to the 72 folk who find us to be del.icio.us (http://del.icio.us/post?url=http%3A%2F%2Fwww.squidoo.com%2Fosbi%2F&title=Open%20Source%20Business%20Intelligence%20on%20Squidoo&jump=no&partner=delbg).

2007.02.20-21
No news is good news. :-D

2007.02.19
Happy President's Day in the USA.

2007.02.17-18
Have a good weekend.

2007.02.16
ServiceMix 3.1 (http://incubator.apache.org/servicemix/servicemix-31.html) was made available on 2007 February 12. This release includes a bunch of new JBI components, and a lot of new features, improvements and bug fixes.

2007.02.15
Bennett Voyles, "Data Mining & Business Intelligence >> Open for Business (http://www.campustechnology.com/article.asp?id=20093)," from Campus Technology, 2007.02.01, is up on the web. Ben interviewed me (http://press.teleinteractive.net/oss/2006/11/29/osbi_for_campus_technology_magazine) for this article. Look for quotes and comments from "di Paolantonio" on pages 3, 4 & 5 of the web article (http://www.campustechnology.com/article.asp?id=20093).

2007.02.14
The Open Solutions Alliance (http://www.opensolutionsalliance.org/) was formally announced on 2007 February 13 at the Linuxworld Open Solutions Summit being held in New York City, NY, USA. In their own words, "The Open Solutions Alliance will seek to expand the market for business open source software solutions through cooperative action that increases awareness of member solutions, reduces barriers to customer adoption, facilitates interoperability, and explains the benefits of open source solutions." Among the founding members are OSBI solution vendors JasperSoft [BI Suite], Talend [ETL], and EnterpriseDB [RDBMS]. One goal of the organization is to improve interoperability among the members' products. In an example given by Michael Harvey, chief marketing officer of CentricCRM, the work could make it easier for customers using their customer-relations software to use JasperSoft's data-mining software to plumb customer records for particular information.

2007.02.13
Julian Hyde announced planned improvements (http://julianhyde.blogspot.com/2007/02/mondrian-cache-control.html) to Mondrian's cache control to better enable real-time OLAP using Mondrian.

2007.02.12
The third release candidate of version 1.5 of the MOLAP engine Palo was released today (http://www.palo.net/index2.php?show=2aa5aeb28a572f1f1dbb01b1996da740) for evaluation and comment.

2007.02.10 & 11
Enjoy the weekend.

2007.02.09
We are adding Labkey (http://labkey.com/)'s CPAS (https://cpas.fhcrc.org/Project/home/begin.view?), currently at version 1.7, to our list of Data Mining (http://press.teleinteractive.net/index.php/ossLinks?cat=369) open source solutions. CPAS (https://cpas.fhcrc.org/Project/home/about/begin.view) is a secure, scientific bioinformatics & collaboration portal, written in Java, and provided under the Apache License. Thanks to Zack Urlocker (http://www.theopenforce.com/2007/02/labkey_computat.html) for pointing out this software.

2007.02.08
Greenplum announces (http://greenplum.com/news/pressRelease.php?item=20070206b) US$19MM in new financing and a new CEO, former Sun Microsystems' Senior Vice President of US sales, Bill Cook. This caught VentureBeat's notice (http://venturebeat.com/2007/02/05/greenplum-and-the-web-20-server/), bringing attention to the Greenplum/Sun data warehousing appliance (http://press.teleinteractive.net/oss/2006/07/26/commercial_open_source_appliance_by_sun) as the Web2.0 Server. In unrelated news, despite three executives recently leaving (http://www.crn.com/sections/breakingnews/dailyarchives.jhtml?articleId=197003440), Ingres has released a new developer bundle for Eclipse (http://ingres.com/press/2007-02-07_Eclipse.php).


2007.02.07
Pentaho announced (http://www.pentaho.org/news/releases/20070206_pentaho_announces_2006_pentaho_partner_awards.php) the winners of the 2006 Pentaho Partner Awards. The awards were presented last week at Pentaho’s Worldwide Partner Summit in Orlando, Florida and include BPM Conseil (http://www.bpm-conseil.com/), OpenBI (http://www.openbi.com/), and Two Heads srl (http://www.pentaho.com/services/partners/partnerProgram.php?partner=50).


2007.02.06
A PostgreSQL Security Update (http://www.postgresql.org/about/news.741) was released to fix the reported vulnerabilities in PostgreSQL that may cause denial of service.


2007.02.05
Secunia (http://secunia.com/advisories/24033/) released an advisory regarding some vulnerabilities that have been reported in PostgreSQL which may cause denial of service.


2007.02.02-04
No announcements.

2007.02.01
Pentaho announced the availability of Pentaho Data Integration 2.4 (http://www.pentaho.org/news/releases/20070201_pentaho_announces_ga_pentaho_data_integration_2_4.php). This latest version adds massively parallel processing (MPP) support as well as additional ease-of-use features to its GUI designer.


January 2007


2007.01.31
JasperSoft Corporation (http://www.jaspersoft.com/) announced the addition of JasperETL™ (http://www.jaspersoft.com/nw_press_jaspersoft_etl_talend.html), a robust Extract, Transform and Load (ETL) product, to the JasperSoft Business Intelligence Suite (JBIS). JasperETL adds data management capabilities to JBIS, which will be made available in Open Source and Professional editions, and was developed through a technology partnership with Talend (http://www.talend.com/). Talend is an open source software provider for data integration tools.

2007.01.30
Mary Grush has invited the authors of this lens (http://press.teleinteractive.net/oss/2007/01/29/campus_technology_2007) to speak at the Campus Technology 2007 (http://campustechnology.com/conferences/summer2007/index.asp) conference to be held in Washington, D.C., USA. We'll be speaking on Wednesday, 2007 August 1 at 11:15am-12:15pm. From our initial conversations with Mary, we'll be discussing, per the flexible placeholder title, "The Economics of BI: Cost-Effective Implementation Strategies/Cost Effective BI/DW Strategy". In other words, providing a framework for implementing data warehousing, data mining, dashboards and other forms of data analytics and business intelligence with an eye towards the cost and economics of the strategy. The framework will delineate management and leadership strategies, and introduce the audience to open source solutions for BI Suites and various related tools. We'll develop this in the OSS Blog (http://press.teleinteractive.net/oss?s=Campus+Technology+2007&sentence=sentence&submit=Search) as much as we can. We would really appreciate any comments there, to help us refine our talk.

2007.01.29
One strategic question that you must ask yourself when planning the architecture for a new data analytics program is how you are going to integrate the data from all of your various source systems. If you are primarily setting up an enterprise data warehouse for historical reports, analysis and trend spotting, traditional ETL may be the way to go. If you are looking more towards real-time analysis and dashboards, you may want to consider an Enterprise Service Bus (ESB) and a Service Oriented Architecture (SOA). There are some good entries for open source ESB, some using the Java Business Integration (JBI) specification from Sun Microsystems, and some not doing so. Both camps are worth a look as you decide the mix of ETL, Enterprise Application Integration (EAI) and ESB appropriate to your users' needs.
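
To make the trade-off between the two integration styles a little more concrete, here is a minimal Python sketch. It is only an illustration under our own assumptions: the record layout, the `transform` and `batch_etl` functions, and the `MiniBus` class are all hypothetical names invented for this example, and `MiniBus` is a toy in-process stand-in, not the API of Mule, ServiceMix, or any other real ESB. The batch path loads a whole extract into a warehouse table at once, while the bus path updates a running dashboard total as each message arrives.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]


def transform(row: Row) -> Row:
    """Normalize one source record into the (hypothetical) warehouse shape."""
    return {
        "customer": str(row["customer"]).strip().upper(),
        "amount": float(row["amount"]),
    }


def batch_etl(source_rows: Iterable[Row], warehouse: List[Row]) -> int:
    """Traditional ETL: extract everything, transform it, load it in one pass."""
    loaded = [transform(row) for row in source_rows]
    warehouse.extend(loaded)
    return len(loaded)


class MiniBus:
    """Toy stand-in for an ESB: delivers each message to topic subscribers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Row], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Row], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Row) -> None:
        # Handlers run immediately, so consumers see data as it arrives.
        for handler in self._subscribers[topic]:
            handler(message)


def demo():
    source = [
        {"customer": " acme ", "amount": "10.50"},
        {"customer": "Globex", "amount": "4.25"},
    ]

    # Batch style: the warehouse is refreshed once per run.
    warehouse: List[Row] = []
    batch_etl(source, warehouse)

    # Message style: a dashboard total is updated per event.
    totals: Dict[str, float] = defaultdict(float)
    bus = MiniBus()

    def update_dashboard(message: Row) -> None:
        row = transform(message)
        totals[row["customer"]] += row["amount"]

    bus.subscribe("orders", update_dashboard)
    for message in source:
        bus.publish("orders", message)

    return warehouse, dict(totals)
```

Calling `demo()` returns both the batch-loaded rows and the per-customer totals built message by message. The same `transform` step serves both paths, which is one reason a mix of ETL and ESB can share cleansing logic.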

2007.01.26
Earlier this month, Palo, the MOLAP plug-in for Microsoft Excel, announced plans to support OpenOffice.org Calc (http://www.palo.net/index2.php?show=f111794a6ee9a3922b023dd43f35926d). They've found development and financial support, but more is always better. Check out the Palo announcement (http://www.palo.net/index2.php?show=f111794a6ee9a3922b023dd43f35926d) and see if you can help.

2007.01.25
Nat Torkington is asking "what you think are the three most important open source projects going today" in a post on O'Reilly Radar (http://radar.oreilly.com/archives/2007/01/survey_three_mo.html). He's conducting a survey with a promise to go into details and "the numbers" next week. Go over there and make your opinion heard.

2007.01.24
Several open source projects related to our OSBI coverage have been named Finalists for the 17th Annual Jolt Product Excellence Awards from CMP Technology's Dr. Dobb's Journal (http://www.prnewswire.com/cgi-bin/stories.pl?ACCT=104&STORY=/www/story/01-23-2007/0004510941&EDATE=). Among these are Pentaho (http://www.pentaho.com/), Mule (http://mulesource.com/), dbDeploy (http://dbdeploy.com/) and Liferay (http://www.liferay.com/web/guest/home).

2007.01.23
DecisionStudio (http://decisionstudio.com/home) brings together open source software into a desktop business intelligence solution (http://decisionstudio.com/wiki/doku.php?id=platform_technical_details) using MySQL for an underlying data warehouse, MySQL Administrator, MySQL Query Browser, DBDesigner Workbench, the R Project and Tinn-R, iReport and Python.

2007.01.22
The GOLAP project no longer shows on Sourceforge.net or even in Google. We've removed it from our linkblog and link modules.

2007.01.19
The Open Source graphical PostgreSQL administration tool for Windows, Linux, FreeBSD, Mac OS X and Solaris, pgAdmin III v1.6.2 (http://www.pgadmin.org/download/) is available for download.

2007.01.18
EnterpriseDB announced (http://www.enterprisedb.com/news_events/press_releases/17_01_07.do) the immediate availability of technical support packages for PostgreSQL as well as changes to the subscription offerings available for its Oracle-compatible EnterpriseDB Advanced Server relational database management system (RDBMS).

2007.01.17
Talend Open Studio (http://www.talend.com/index.htm) reached over 25,000 downloads in early January, 2007. As of 2007.01.12, Talend Open Studio v1.1.1 became available for download (http://www.talend.com/downloads-talend.htm).


2007.01.16
Mondrian 2.2 (http://sourceforge.net/project/showfiles.php?group_id=35302) is available for download. Read the Release Notes (http://sourceforge.net/project/shownotes.php?group_id=35302&release_id=460253) to see what is included in this release.


2007.01.15
Clover.ETL rel 2.0.3 (http://cloveretl.berlios.de/index.php) was released on 2007 January 10 and is available for download (http://cloveretl.berlios.de/download.php).


2007.01.14
Derby 10.2.2.0 (http://db.apache.org/derby/derby_downloads.html) is available for download.


2007.01.13
EnterpriseDB Advanced Server 8.1 R2 Update 1 (8.1.5.21) (http://www.enterprisedb.com/products/download.do) is available for download.


2007.01.12
Dave Dargo blogs about his 2006 activities for Ingres and announces he is leaving (http://blogs.ingres.com/davedargo/content/2007-01-11.html) his position as Senior Vice President of Strategy and Chief Technology Officer. It seems he'll be doing more consulting and has started a new blog (http://davedargo.blogspot.com/), which shows that he'll be playing some golf as well.


2007.01.09-11
No OSBI Daily today. There were problems editing the lens, and then we were off to MacWorld. From MacWorld we can say that the Apple iPhone looks slick, but isn't the converged device we want (http://press.teleinteractive.net/cynasuralog/2005/06/01/perfect_handheld_not_yet).

2007.01.08
Berkeley DB XML Release 2.3 (http://www.oracle.com/database/berkeley-db/xml/index.html) is now available. This release includes performance improvements, an event API, and updated XQuery 1.0 and XPath 2.0 support.


2007.01.07
PostgreSQL 8.2 is now available for download (http://www.postgresql.org/ftp/). The release notes (http://www.postgresql.org/docs/current/static/release-8-2.html) list the user-requested performance enhancements and functionality included in this release.


2007.01.06
An initial Developer Preview 1 release of BaseTen, an Open Source Mac framework for PostgreSQL databases (http://www.postgresql.org/about/news.710), has been announced. This release is intended for evaluation purposes only.


2007.01.05
BIRT is an open source Eclipse-based reporting system. On December 21, 2006, the list of features in BIRT 2.2 Milestone 2 (http://www.eclipse.org/birt/phoenix/project/notable2.2M2.php) was published. Listed features included:

  • BIRT Web Project Wizard
  • Open Data Access (ODA) Project Wizards
  • New Chart Riser Types
  • Flat File Data Source Changes
  • Joint Data Set Improvement


2007.01.04

We were contacted by Stephanie Endlich of Jedox GmbH (http://www.jedox.com/), who suggested a better description for Palo (http://www.palo.net/): "PALO is a memory based Open Source MOLAP which is able to consolidate data hierarchies in real time and which supports write-back of data. It provides a MDDB and a free Microsoft Excel add-in. For flexible integration Palo offers APIs for Java, PHP, C and .NET. From their homepage... 'Palo is an advanced data store for Microsoft Excel that allows you to handle large amounts of Excel data on a small number of worksheets. In addition, it also allows you to share Excel data real-time with your colleagues.'" Let us know what you think in comments to our nearly identical blog post (http://press.teleinteractive.net/oss/2007/01/04/update_to_our_palo_description), and we'll update our description accordingly.

2007.01.03

You can get the OPML file of our blogroll of bloggers blogging about open source (http://press.teleinteractive.net/oss/2007/01/03/bloggers_blogging_open_source) from our blog. ;-) Feel free to import and use it in your favorite feed reader.


2007.01.02

Palo 1.5 RC1 is available (http://www.palo.net/index2.php?show=3af981cf59d925572d007f4b0cdca7f9)

As of 2006 December 27, the first release candidate of the new Palo 1.5 is available for download (http://www.palo.net/index2.php?show=3). Palo 1.5RC1 offers major improvements based on speed, user rights and new functions.


2007.01.01

Happy New Year. The OSBI Daily will return on 2007 January 2.


December 2006


2006.12.31

Happy New Year. The OSBI Daily will return on 2007 January 2.

2006.12.30

The Firebird team released its Roadmap for 2007 (http://www.firebirdsql.org/index.php?op=devel&sub=engine&id=roadmap_2007&nosb=1).

2006.12.29

OpenMFG (http://www.openmfg.com/), LLC announced (http://press.teleinteractive.net/oss/2006/12/29/openrpt_2_0_available_for_download) the availability of OpenRPT 2.0 for download (http://sourceforge.net/project/showfiles.php?group_id=132959) as of 2006 December 20.

2006.12.28

SourceForge.net has version 1.3.0 of both JasperReports (http://sourceforge.net/project/showfiles.php?group_id=36382) and iReport (http://sourceforge.net/project/showfiles.php?group_id=64348) available for download, though we haven't seen an official announcement (http://press.teleinteractive.net/oss/2006/12/28/current_version_confusion_for_jasperrepo) yet.

2006.12.27

Pentaho Corp. announced on December 14th (http://www.pentaho.org/news/releases/20061214_google_code_features_pentaho.php) that Google Code (http://code.google.com) is featuring Pentaho's AJAX-based integration with the Google Maps API. Pentaho's example was selected as a compelling and innovative example of the flexibility and power of the Google Maps API.

2006.12.26

SQL Manager for PostgreSQL v. 3.8 released (http://www.postgresql.org/about/news.704) Source: PostgreSQL News Links

2006.12.25

Lawrence Lessig, the founding chairman of Creative Commons, retired last week (http://trends.newsforge.com/trends/06/12/21/1450219.shtml?tid=138). Creative Commons, a non-profit organization, is known for the Creative Commons copyright licenses.

2006.12.24

SpagoBI 1.9.1 demo environment (http://spagobi.eng.it:8080/sbiportal/faces/public/exo) has been out since 04 December 2006. "SpagoBI, enriched by a demo environment, offers an infrastructure supporting the analytical documents management, in term of activities to handle their approval process, release management, parametric activation and publishing in a portal environment with the visibility based on end-user's roles."

Source: SpagoBI News Page (http://spagobi.objectweb.org/)

2006.12.23

EnterpriseDB Named IT Success Story By New Jersey Technology Council (http://www.enterprisedb.com/news_events/press_releases/19_12_06.do)

2006.12.22

"Venture capital investments in 2007 will be aimed at enterprise-focused start-ups with products that push the innovation of wireless devices and services, security, open source and virtualization, industry experts say."

Source: LinuxWorld.com: The year ahead: VCs will back innovators with an enterprise focus (http://www.linuxworld.com/news/2006/122106-vc-enterprise.html)

2006.12.21

JasperSoft announced that it has passed the 5000 customer milestone (http://www.jaspersoft.com/nw_press_js_5000customers.html) with two million product downloads, 20,000 deployments, and 23 global partners. Pentaho (http://www.pentaho.com/) claims over 1.5 million downloads. SpagoBI (http://spagobi.objectweb.org/), openi (http://openi.sourceforge.net/) and Bee (http://bee.insightstrategy.cz/en/index.html) are other OSS projects providing full BI suites.

2006.12.20

When evaluating ETL tools, look closely at how each tool handles and works with combinations of disparate source system types.

2006.12.18

BI query response time is important. Since most users tend to be impatient and want instantaneous results, overall system performance tuning and monitoring are important.

2006.12.17

The goals of decision support systems differ from the goals of transactional systems. When designing the database for data warehouse and decision support, remember to optimize for query processing speed and loading speed.

2006.12.16

With the complications and complexities of organizational culture and politics, it is easy to fall into the trap of the "isolationist approach". Remember that the DW/BI effort can only be successful if one works with the other groups and there is a free exchange of information. After all, DW/BI brings together different data from different sources.

2006.12.15

It is important to gather requirements thoroughly in order to create a BI roadmap.

2006.12.14

When considering BI solutions, one should consider how the solution would support long term goals and not just the immediate BI requirements. By creating a comprehensive BI strategy and roadmap, one considers the overall technical architecture, user needs, governance, etc., that are vital for the success of the implementation.

2006.12.13

ETL is one of the most challenging aspects of a BI/DW project implementation. The ETL process brings together, combines and transforms data from multiple source systems thus enabling users to work off an integrated data set.

2006.12.12

When gathering information on OSBI products for evaluation, post on open source mailing lists to gather insights and recommendations of other users on their experience as well as their opinions.

2006.12.11

The open source community of an Open Source product is vital to its success. In evaluating OSBI products for your company, look at how active the community is and the type of community support.

2006.12.10

Understanding Open Source Software - by Red Hat's Mark Webbink, Esq. (http://www.groklaw.net/article.php?story=20031231092027900) is a good article on open source licenses. It also touches on US copyright law as well as how derivative works are defined.

2006.12.09

Understand the implications of the Open Source license of the OSBI tools you are using.

2006.12.08

The licensing agreement, whether open source or proprietary, should be an important part of your BI tools evaluation process. The primary open source licenses are GPL and Berkeley derived, but there are many more, including a new one being considered that modifies the Mozilla license with a specific attribution clause and is causing (http://blogs.zdnet.com/Berlind/?p=211) some controversy (http://www.nicholasgoodman.com/bt/blog/2006/11/15/open-source-has-a-little-secret-exhibit-b/).

2006.12.07
Not knowing who the stakeholders of a BI project are is one common reason for scope creep.

2006.12.06
A Super-type in data modeling is a means of classifying an entity that has sub-types.

2006.12.05
For an IT professional, it is easy to get caught in the details of the technical (data, architecture, etc.) delivery of the BI systems. It is good to always evaluate the technical work to see how well it aligns with the user needs as well as its ease of use.

2006.12.04
In selecting BI tools, it is important to follow a formal evaluation process. Having a list of criteria is vital but should be accompanied by a methodology that helps the organization find the tools that fit its needs.

2006.12.03
Not identifying the proper stakeholders is one overlooked reason for BI scope creep.

2006.12.02
In planning for your BI project implementation, it is good practice to define the activities that will be done. Once you have created a scope baseline, decompose the work into activity details.

2006.12.01
For BI initiatives to be successful, various parts of the organization must collaborate, work together and share information. One of the means of doing this is by establishing enterprise governance, which includes business sponsorship as well as the establishment of guidelines, SLAs, etc.


November 2006


2006.11.26
A Function Dependency Diagram provides a visual means of recording and documenting interdependencies between business functions. It also shows events that cause functions to be triggered.

2006.11.25
A Dataflow Diagram is used to represent the use of data by processes or business functions.

2006.11.24
A matrix diagrammer is a tool that can be used to show the interrelationships of the different entities in your data model and their associated business functions.

2006.11.23
Functional errors are associated with the Functional Requirements that the BI system is designed to provide. Testing for these types of errors can be effectively performed by comparing the specifications document, design document and the completed system.

2006.11.22
As BI processing within organizations gets bigger and more complex, there is a greater need for organizational silos to disappear. There is greater need to share skills, resources and information that most individuals/groups keep to themselves.

2006.11.21
In many cases, the BI portal is the primary point of interaction for users. Ensure that your users have a positive experience using it. Users do have a lot to say when it comes to the success of a BI implementation.

2006.11.20
A data integration strategy must take into account the different aspects of the organization, including, but not limited to, business processes, applications, infrastructure and user interaction.

2006.11.19
A well-designed system can prevent performance problems when it is implemented and deployed. System designers and application developers must understand their environment’s processing mechanisms in order to utilize them and optimize their design and code.

2006.11.18
System performance tuning is an iterative process. The goals are (1) to achieve the desired and defined performance metrics and (2) to qualify and quantify in detail the environment that contributed to achieving the performance goals.

2006.11.17
Enhance your database performance by tuning the database application, the database engine, the operating system and networks.

2006.11.16
There are different factors that determine your system's performance. These include, but are not limited to, usage patterns, system configuration (CPU, memory, disk) and software/application configuration. A system that is well-configured, used, managed and tuned properly will perform better than one that is not.

2006.11.15
The performance of much software is greatly affected by disk I/O. Sometimes the CPU waits until the I/O completes. Examine your SQL statements and determine whether the "I/O bound" ones can be tuned.

2006.11.14
Regularly monitor the processes which are running on your system to minimize the likelihood of security breaches.

2006.11.13
Swapping is the process by which memory pages of a given process are written out to the swap device. Configure your swap space accordingly to avoid performance problems and degradation.

2006.11.12
Most databases use roles to manage database access permissions. This concept is different from the operating system users and permissions. Understand how your database implements roles and use it to enhance your security requirements.

2006.11.11
It is important to fully understand the requirements. If one jumps in and starts designing and coding the ETL without fully understanding the required BI functions, the design will fail.

2006.11.10
It is important to understand and map your current BI needs. A complete picture of the current state will help you know how to get to the future state.

2006.11.09
When implementing BI solutions, utilize database and operating systems related technologies that help meet scalability, availability, performance and security requirements.

2006.11.08
Partitioning large database tables and indexes provides several query performance benefits. If your database has support for partitioning, use partitioning techniques smartly.

2006.11.07
When selecting Open Source BI products for your organization, it is a good practice to have a formal evaluation process. Creating a list of criteria isn't enough. It is effective to implement a well-thought-out selection strategy to make sure you find the product that best fits your needs.

2006.11.06
A database stored procedure provides a programmatic, procedural interface to the RDBMS. It is a combination of procedural control statements, SQL statements, and declarations that allow for creating program logic.

2006.11.05
For the identified risks in your BI project, create a Contingency Plan as well as a Fallback plan. A contingency plan provides alternative strategies to be used to ensure project success if specified risk events occur. A fallback plan provides alternative project approaches for risks that have a high impact or for an identified strategy that may not be fully effective.

2006.11.04
It is good practice to keep a Design Decision Document or a design log that records all of the issues that crop up during your BI implementation, along with the decisions and actions taken. It helps prevent scenarios in which no one can justify why a certain feature exists because all past arguments and project history have long been forgotten. It is also good background reading for new team members.

2006.11.03
Database views are virtual tables that one can use to retrieve data from underlying tables. Because they are "virtual tables", one does not need to create physical tables to see a simplified part or subset of the database. Other uses or advantages of database views include the ability to restrict table access and updates for security purposes, and logical data independence, which minimizes modification of the application if base tables require any restructuring.
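
As a rough illustration of the access-restriction point, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for whatever RDBMS you use. The customer table, its columns and the northeast_customer view are hypothetical names made up for this example.

```python
import sqlite3

# Hypothetical schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id           INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        region       TEXT NOT NULL,
        credit_limit REAL NOT NULL   -- sensitive column we do not want to expose
    );
    INSERT INTO customer VALUES (1, 'Acme',    'Northeast', 50000.0);
    INSERT INTO customer VALUES (2, 'Globex',  'West',      75000.0);
    INSERT INTO customer VALUES (3, 'Initech', 'Northeast', 20000.0);

    -- The view stores no rows of its own; it is a named query over customer.
    CREATE VIEW northeast_customer AS
        SELECT id, name FROM customer WHERE region = 'Northeast';
""")

# Readers of the view see a subset of rows and a subset of columns.
cursor = conn.execute("SELECT * FROM northeast_customer ORDER BY id")
view_columns = [description[0] for description in cursor.description]
view_rows = cursor.fetchall()
```

If the customer table is later restructured, only the view definition needs to change, while queries against northeast_customer can stay the same.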

2006.11.02
Database referential integrity prevents database inconsistencies when inserts, updates and deletes are performed. Referential integrity states that a row cannot exist in a table with a non-null value for a referencing column if an equal value does not exist in a referenced column.

2006.11.01
It is obvious why backups are necessary. No one wants to suffer any type of computer system/data failure or catastrophe and not be able to recover without any data loss. So, it is important not only to have a backup strategy but also tested processes and procedures for Backup and Recovery.


October 2006


2006.10.27
In a document, an index provides a mechanism for faster retrieval of information. The same is true for an index in the context of relational database management. An index helps improve performance by providing an efficient mechanism for searching rows in a table.

2006.10.26
Implement configuration management in your BI implementation. It will help identify changes, determine if changes have been implemented and help in verifying changes after changes have been completed.

2006.10.25
Ensure that effective change control is in place during your BI implementation. Change must be managed and documented, and only approved changes must be implemented.

2006.10.24
Relational database modeling is based on mathematical set theory. In simplistic terms, the language of set theory is based on a single fundamental relation, called membership. Tables/Data in the relational model are represented as mathematical relations. In basic terms, it is a mathematical model of the business.
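
To make the set-theory view a little more concrete, here is a small sketch in Python that treats a relation as a set of tuples. The customer relation and its values are hypothetical, and this only illustrates membership, selection and projection; it is not a description of how any RDBMS is implemented.

```python
# A relation is a set of tuples; here each tuple is (id, name, region).
customer = {
    (1, "Acme", "Northeast"),
    (2, "Globex", "West"),
    (3, "Initech", "Northeast"),
}

# Membership is the fundamental question: is this tuple in the relation?
is_member = (2, "Globex", "West") in customer

# Selection (restriction): the subset of tuples that satisfy a predicate.
northeast = {row for row in customer if row[2] == "Northeast"}

# Projection: keep only some attributes. Duplicates collapse automatically
# because the result is again a set.
regions = {(row[2],) for row in customer}
```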

2006.10.23
The design process is iterative. Scope each design iteration. During the proof of concept stage, show something simple and get feedback from the users. Get feedback on what works and what needs to be improved upon. The next design iteration can include X number of requirements and so forth. Throughout the process, remember that successful designs model the business needs.

2006.10.22
The logical database design establishes the entities and their relationships. The logical design ensures that real world objects, their attributes and relationships are represented in the database design.

2006.10.21
When analyzing the requirements of your BI implementation, identify common operational reports. The most frequently used cross-functional reports are usually the ones aligned with the most important business processes.

2006.10.20
During data discovery, survey the documentation related to the current legacy systems. This is valuable in determining existing operational systems that are likely to be sources for the BI system, reporting or DSS.

2006.10.19
As in any project, BI implementation needs executive project sponsorship. Without a project sponsor, it is likely that the project will fail.

2006.10.18
Manage the architecture of your BI implementation by considering all areas, including but not limited to, data, applications, technical architecture, tools (data access, ETL, etc.), support, etc. Architecture should be managed to ensure complete integration across all areas of the BI implementation.

2006.10.17
Security is an important component of the BI implementation. Ensure that a comprehensive security plan is implemented.

2006.10.16
An index is a mechanism that can be used by an RDBMS optimizer to enhance data access. Indexes avoid the necessity to perform full-table scans to locate a certain number of rows that a query needs to retrieve or update. Evaluate your BI SQL indexing strategies to optimize SQL performance.
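
The effect of an index on the access path can be seen with a minimal sketch like the one below. It uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in; your own database will have its own plan output and tooling. The orders table and idx_orders_customer index are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

QUERY = "SELECT order_id, amount FROM orders WHERE customer_id = 42"


def plan(connection, sql):
    """Return the human-readable detail text of SQLite's query plan."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Without an index, the only option is to scan every row of the table.
plan_before = plan(conn, QUERY)

# With an index on the filtered column, the optimizer can search the index.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(conn, QUERY)

matching = conn.execute(QUERY).fetchall()
```

Printing plan_before and plan_after shows the change from a table scan to an index search; the exact wording of the plan text varies between SQLite versions.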

2006.10.15
As a critical tool for the business, the BI implementation must evolve with the changing needs of the business. Static BI implementations give little or no value to the business.

2006.10.14
A properly designed and configured BI database provides efficient access to the required data to support queries and data mining requests.

2006.10.13
Ensure that there is a party responsible for process administration in your BI implementation. This group will ensure that all the things that are supposed to happen for the data warehouse do happen and that processes are followed to ensure data accuracy and security. The group will also ensure that any problems that are encountered are resolved.

2006.10.12
When designing BI applications one must think in terms of basic design requirements: simplicity, consistency and accuracy, query-processing speed and loading speed.

2006.10.11
During the creation of source-target mappings, create tasks that do the basic analysis as well as the transformation analysis.

2006.10.10
Do not drive the BI requirements based on the data but rather on the functions and needs of the enterprise.

2006.10.09
During the first few weeks of starting a BI implementation, ensure that the mission, vision and scope of the effort are quickly determined.

2006.10.08
Data quality is a common problem in BI implementations. To help keep data throughout the enterprise clean and consistent, establish a data governance program that includes data stewardship.

2006.10.07
BI implementations must be business-driven and not technology-driven.

2006.10.06
Most database technologies use internal statistics to optimize query execution. When writing SQL queries, understand and make use of your database's optimizer.

2006.10.05
Consider partitioning large tables and indexes to improve query performance. Queries run faster when separate processes access separate partitions.

2006.10.04
Create a set of metrics to quantify the quality of data being processed and usefulness of BI reports created for your users.

2006.10.03
As in any project, setting realistic expectations and managing the expectations of users is very important. Hence, for your BI implementation, negotiate a reasonable subject scope and identify incremental deliverables.

2006.10.02
A separate set of documentation should be created and maintained that is geared towards the support services staff of your BI implementation. It should include things like instructions for performing certain tasks, standard operating procedures, guidelines on what to do when certain errors are encountered, etc.

2006.10.01
Plan, design and implement procedures for managing the user community throughout the life of the BI systems. It is important to remember that the initial users of the BI system will change and evolve as the system is used more.


September 2006


2006.09.30
Identifying and keeping track of different data versions is an integral component of data taxonomy.

2006.09.29
Creating a data glossary, an alphabetical listing of words, terms, acronyms, definitions, abbreviations, etc., helps provide an understanding of the data used in the BI implementation.

2006.09.28
Creating a "data thesaurus" is helpful in providing reference between similar terms and names in the organization.

2006.09.27
Determine the different types of users of your BI implementation. Provide user access that is based on role assignments and associated security permissions.

2006.09.26
It is helpful to understand the different features supported by the database you are using. Understanding certain features can help you optimize your database operations. For example, if you are using PostgreSQL, it may be better to use the COPY command instead of the INSERT command when loading large numbers of rows. COPY incurs less overhead than a series of INSERT commands for large bulk loads, though it is less flexible.

2006.09.25
Creating a Partial Index is usually helpful in a scenario where one queries a specific value more often than others. For example, if you are using PostgreSQL, you can create a partial index like:

CREATE INDEX customer_idx ON customer (active) WHERE active = 'TRUE';

In this scenario, active customers are displayed and/or queried often. A partial index that covers only the customer rows whose active value is TRUE could help the performance of such queries.
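
For readers who want to try the idea without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module, which also supports partial indexes. It is an analogy, not PostgreSQL itself: the customer table layout is hypothetical, and active is stored as the text values 'TRUE' and 'FALSE' for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, active TEXT)")
conn.executemany(
    "INSERT INTO customer (name, active) VALUES (?, ?)",
    [("customer-%d" % i, "TRUE" if i % 10 == 0 else "FALSE") for i in range(200)],
)

# Only rows with active = 'TRUE' get an entry in the index, so the index
# stays small even though most customers in this sample are inactive.
conn.execute("CREATE INDEX customer_idx ON customer (active) WHERE active = 'TRUE'")

# The frequent query repeats the index predicate, which lets the optimizer
# consider the partial index.
active_count = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE active = 'TRUE'"
).fetchone()[0]

# The stored definition confirms that the index carries a WHERE clause.
index_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'index' AND name = 'customer_idx'"
).fetchone()[0]
```

Whether the optimizer actually picks the partial index depends on the database, its statistics and the query, so it is worth checking the query plan on your own system.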

2006.09.24
Use SQL tuning tools for your database. Examine how the SQL queries of your application perform and see exactly how the database processes each query. For example, in PostgreSQL, use the SQL command EXPLAIN ANALYZE.

2006.09.23
When scoping a BI implementation, it is helpful to survey the existing documentation related to existing operational systems that are likely to be source systems.

2006.09.22
A metadata repository contains and describes the various aspects of the data in the data warehouse.

2006.09.21
A datamart that uses HOLAP or Hybrid OLAP utilizes a combination of multidimensional database structures and relational tables. The multidimensional database structures are used for precalculated summary information while the relational tables are used for detailed data.

2006.09.20
Successful BI/DW designs model business needs. Make the design process iterative. Start with something simple, e.g. proof of concept, and then get feedback on what works and future capabilities.

2006.09.19
Data quality involves many facets including data stewardship, data governance, data cleansing, metrics, processes, etc. Include data quality in your BI planning and implementation.

2006.09.18
As the BI implementation matures, the support needs of the users change. Indicators like system utilization, statistics, requests for additional training, user support calls, frequency of requests for new functions, etc. show how much the BI implementation is used and how well the users know the system. Knowing these indicators is helpful in understanding and fine tuning your BI system and its infrastructure, including your support capabilities.

2006.09.17
Master/detail relationships between tables based on primary and foreign keys are enforced using referential integrity constraints. By requiring the detail record to have a corresponding record in the master, referential integrity constraints prevent invalid INSERT, DELETE, and UPDATE operations.
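
A minimal sketch of this behavior, using Python's built-in sqlite3 module as a stand-in for your RDBMS, is shown below. The customer (master) and cust_order (detail) tables are hypothetical. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; most other databases enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE customer (
        cust_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE cust_order (
        order_id INTEGER PRIMARY KEY,
        cust_id  INTEGER NOT NULL REFERENCES customer (cust_id)
    );
    INSERT INTO customer VALUES (1, 'Acme');
    INSERT INTO cust_order VALUES (100, 1);
""")


def is_rejected(connection, sql):
    """Run one statement and report whether a constraint violation blocked it."""
    try:
        connection.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A detail row that points at a master row that does not exist is invalid.
orphan_insert_rejected = is_rejected(conn, "INSERT INTO cust_order VALUES (101, 999)")

# A master row cannot be deleted while detail rows still reference it.
master_delete_rejected = is_rejected(conn, "DELETE FROM customer WHERE cust_id = 1")

# A detail row with a matching master row is accepted.
valid_insert_rejected = is_rejected(conn, "INSERT INTO cust_order VALUES (102, 1)")
```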

2006.09.16
Monitor and measure your BI infrastructure performance both in quantifiable and non-quantifiable measures. Changes in any of these measures can impact the performance of the BI implementation.


2006.09.15
I/O and memory inefficiency and bottlenecks are common causes for poor database performance when the data warehouse experiences data load growth. Tune the database as well as examine and optimize SQL statements being executed.


2006.09.14
Performance benefits for low cardinality and rarely changed attributes can be gained from bitmap indexing, clustering and hash indexes.


2006.09.13
Perform capacity planning for your BI implementation. Design the system so that the ideal response time can still be met under peak resource utilization.


2006.09.12
When designing database processing for your BI implementation, create load and transformation modules that are broken down into smaller and independent modules which can be grouped together. Whenever possible, use stored procedures and packages with strong exception handling standards.


2006.09.11
Bitmap indexes are ideal for low cardinality columns such as gender, marital status, etc. Bitmap indexes help increase performance of queries.
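
The idea behind a bitmap index can be sketched in a few lines of Python, using plain integers as bit strings. This is a toy model under simplifying assumptions (a tiny in-memory list of hypothetical rows, no compression), not how any particular database stores its bitmaps.

```python
# Rows of a hypothetical customer table: (gender, marital_status).
rows = [
    ("F", "married"),
    ("M", "single"),
    ("F", "single"),
    ("M", "married"),
    ("F", "married"),
]


def build_bitmap_index(table, column):
    """Map each distinct column value to an int whose bit i is set when row i has that value."""
    index = {}
    for position, row in enumerate(table):
        index[row[column]] = index.get(row[column], 0) | (1 << position)
    return index


def matching_positions(bitmap):
    """Decode a bitmap back into the list of row positions it marks."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


gender_index = build_bitmap_index(rows, 0)
marital_index = build_bitmap_index(rows, 1)

# gender = 'F' AND marital_status = 'married' becomes a single bitwise AND.
married_women = matching_positions(gender_index["F"] & marital_index["married"])
```

With only a handful of distinct values per column, the index needs just a few bitmaps, which is why low cardinality columns suit this technique.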


2006.09.10
Continuously identify core information that is used in different enterprise applications that would benefit from centralization and your BI implementation.


2006.09.09
To successfully support the capture, integration, and use of a consolidated data repository for your BI implementation, institute policies and procedures for data governance as well as implement data stewardship.


2006.09.08
To maximize the benefits and attractiveness of your BI implementation, understand how current IT investments and resources can be leveraged.


2006.09.07
"The most prevalent driver of consolidation is the need for consistent data across the enterprise. Almost all survey respondents (90 percent) rated this factor as “very high” or “high.” This was followed by “reducing costs and overhead” (71 percent) and “standardizing the IT portfolio” (50 percent)."
Source: TDWI Report Series: In Search of a Single Version of Truth: Strategies for Consolidating Analytic Silos (http://www.tdwi.org/research/display.aspx?ID=7173)



2006.09.06
Implement an incremental, iterative approach to your BI implementation. Build momentum and support along the way, as well as align the BI efforts to corporate strategy and implementation plan.



2006.09.05
The main objective of a data integration project is to provide a unified view of the data that exists in different parts of the organization.


2006.09.04
When managing the schedule and critical activities of your Open Source BI project, the other aspects of the project - Scope, Cost, Quality, Resource Management, Communication, Risk, and Procurement Management - should not be neglected or given less emphasis.
See TIA Press Yackity Blog Blog Float in Project Time Management (http://press.teleinteractive.net/yackity/2006/09/04/float_in_project_time_management)


2006.09.03
Positioning OSBI as providing value with respect to cost, technologies, people and culture will go a long way.
See TIA Press OSS Blog Open Source Business Intelligence Adoption (http://press.teleinteractive.net/oss/2006/09/03/open_source_business_intelligence_adopti)



2006.09.02
When assessing your existing BI implementation, determine if there are operational reports that don't belong in the BI environment.


2006.09.01
Foreign key values in a referencing relation must exist as primary key values in the referenced relation. For example, if the Customer orders table has a foreign key CUST_ORDID that links to the Shipment Fact table, a row in the Shipment Fact table cannot be deleted while a Customer orders row still references its CUST_ORDID.


August 2006



2006.08.31
Implement an overall data security architecture in your BI implementation.


2006.08.30
An Operational Data Store (ODS) provides an efficient interim way to store time-sensitive data for use in the data warehouse.


2006.08.29
"Facts are “What” / Dimensions are “How”" - Nicholas Goodman on Business Intelligence blog entry August 25th, 2006 entitled: Overview of Business Intelligence/DW (http://www.nicholasgoodman.com/bt/blog/)


2006.08.28
"... change OpenReports Portal to MarvelIT DASH Portal. We have changed the name and are not associated with the Open Reports project any longer.

We offer a Dashboard solution that is based on the Apache Jetspeed portal and our unique charting portlets. Using Marvelit DASH - dashboards can be built very easily using the power of SQL and XML. See our demo at demo.marvelit.com." - Rick Mortensen, CEO of MarvelIT in an email to InterActive Systems & Consulting, Inc. (http://www.iasc.com)

See TIA Press OSS Blog MARVELit Dash Portal Solution (http://press.teleinteractive.net/oss/2006/08/27/marvelit_dash_portal_solution)


2006.08.27
When justifying a BI System Implementation, quantify and provide specifics. See TIA Press OSS Blog Justifying a BI System (http://press.teleinteractive.net/oss/2006/08/26/justifying_a_bi_system)


2006.08.26
Always involve the business users in your BI implementation. Uninvolved business constituents lose interest in the project or sometimes become unwilling to share in the responsibility.


2006.08.25
Project Managers of BI implementations can use the ABC Analysis to help analyze people-related performance issues that may affect the project outcome.

See Yackity Blog Blog Post: ABC Analysis (http://press.teleinteractive.net/yackity/2005/05/28/abc_analysis)



2006.08.24
A BI/data warehouse steward, working with IT, can effectively settle different data representation and transformation rules by working with the user community.


2006.08.23
Remember that the BI implementation is successful only if it is effectively used. Train users both to use the system and to think analytically. For example, users can be taught that they can create reports like: Top customers in terms of revenue in the Northeast Region for the product line X.
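
As a rough idea of what sits behind such a report, here is a minimal sketch using Python's built-in sqlite3 module. The sales table, its columns and the sample figures are all hypothetical; a real BI system would more likely query a star schema or an OLAP cube.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (customer TEXT, region TEXT, product_line TEXT, revenue REAL);
    INSERT INTO sales VALUES ('Acme',     'Northeast', 'X', 1200.0);
    INSERT INTO sales VALUES ('Acme',     'Northeast', 'X',  800.0);
    INSERT INTO sales VALUES ('Globex',   'Northeast', 'X', 1500.0);
    INSERT INTO sales VALUES ('Initech',  'Northeast', 'Y', 9000.0);
    INSERT INTO sales VALUES ('Umbrella', 'West',      'X', 7000.0);
    INSERT INTO sales VALUES ('Hooli',    'Northeast', 'X',  300.0);
""")

# Filter by region and product line, total the revenue per customer,
# then keep the customers with the highest totals.
top_customers = conn.execute(
    """
    SELECT customer, SUM(revenue) AS total_revenue
    FROM sales
    WHERE region = ? AND product_line = ?
    GROUP BY customer
    ORDER BY total_revenue DESC
    LIMIT ?
    """,
    ("Northeast", "X", 2),
).fetchall()
```

Changing the three parameters gives the same report for another region, product line or list length, which is the kind of analytical variation users can be trained to ask for.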


2006.08.22
Justifications for a BI implementation that support strategies for enhancing revenue generate stronger and longer term support from higher management.


2006.08.21
Continuously review your BI implementation to ensure that changing business requirements are evaluated, planned for and implemented.


2006.08.20
The concept of a multidimensional cube is a good way to help users understand how they may want to query the multidimensional database or create OLAP reports.

See Yackity Blog Blog Post: Multidimensional Cube - Simple Explanation for Users (http://press.teleinteractive.net/yackity/2006/08/16/multidimensional_cube_simple_explanation)


2006.08.19
The Third Normal Form states: "No non-UID can depend upon another non-UID attribute".

See Yackity Blog Blog Post: First 3 Rules of Data Normalization for Newbies - Part 3 (http://press.teleinteractive.net/yackity/2004/12/21/first_3_rules_of_data_normalization_for_4)
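
A minimal sketch of the Third Normal Form rule, assuming a hypothetical employee entity whose UID is emp_id. The sample rows, attribute names and the to_third_normal_form helper are illustrative assumptions, not taken from the referenced blog post. Here dept_name depends on dept_id, which is a non-UID attribute, so it is moved to its own entity.

```python
# Hypothetical sample rows: emp_id is the UID. dept_name depends on dept_id,
# a non-UID attribute, so this flat layout violates Third Normal Form.
flat_employees = [
    {"emp_id": 1, "name": "Ana", "dept_id": 10, "dept_name": "Finance"},
    {"emp_id": 2, "name": "Ben", "dept_id": 10, "dept_name": "Finance"},
    {"emp_id": 3, "name": "Cy", "dept_id": 20, "dept_name": "Marketing"},
]


def to_third_normal_form(rows):
    """Split the rows so that dept_name is stored once per dept_id."""
    departments = {}
    employees = []
    for row in rows:
        departments[row["dept_id"]] = row["dept_name"]
        employees.append(
            {"emp_id": row["emp_id"], "name": row["name"], "dept_id": row["dept_id"]}
        )
    return employees, departments


employees, departments = to_third_normal_form(flat_employees)
print(employees)
print(departments)
```

After the split, renaming a department means changing one value instead of every employee row that repeats it.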


2006.08.18
The Second Normal Form states: "An attribute must depend upon its entity's entire UID".

See Yackity Blog Blog Post: First 3 Rules of Data Normalization for Newbies - Part 2 (http://press.teleinteractive.net/yackity/2004/12/20/first_3_rules_of_data_normalization_for_3)
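
A minimal sketch of the Second Normal Form rule, assuming a hypothetical order line entity whose UID is the pair (order_id, product_id). The sample data and the to_second_normal_form helper are illustrative assumptions. Here product_name depends on product_id alone, which is only part of the UID, so it is moved to a separate product entity.

```python
# Hypothetical order lines: the UID is (order_id, product_id).
# product_name depends only on product_id, i.e. on part of the UID.
flat_order_lines = [
    {"order_id": 1, "product_id": "A", "product_name": "Widget", "quantity": 2},
    {"order_id": 1, "product_id": "B", "product_name": "Gadget", "quantity": 1},
    {"order_id": 2, "product_id": "A", "product_name": "Widget", "quantity": 5},
]


def to_second_normal_form(rows):
    """Move attributes that depend on only part of the UID to their own entity."""
    products = {}
    order_lines = []
    for row in rows:
        products[row["product_id"]] = row["product_name"]
        order_lines.append(
            {
                "order_id": row["order_id"],
                "product_id": row["product_id"],
                "quantity": row["quantity"],
            }
        )
    return order_lines, products


order_lines, products = to_second_normal_form(flat_order_lines)
print(order_lines)
print(products)
```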


2006.08.17
The First Normal Form states: "All attributes are not repeated or must be single-valued"

See Yackity Blog Blog Post: First 3 Rules of Data Normalization for Newbies - Part 1 (http://press.teleinteractive.net/yackity/2004/12/19/first_3_rules_of_data_normalization_for_2)
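
A minimal sketch of the First Normal Form rule, assuming a hypothetical customer entity that stores several phone numbers in one comma-separated attribute. The sample data and the to_first_normal_form helper are illustrative assumptions. The multi-valued attribute is rewritten as one single-valued row per phone number.

```python
# Hypothetical customers: "phones" holds several values in one attribute,
# which violates First Normal Form.
customers = [
    {"customer_id": 1, "phones": "555-0100, 555-0101"},
    {"customer_id": 2, "phones": "555-0199"},
]


def to_first_normal_form(rows):
    """Return one row per (customer_id, phone) so each attribute is single-valued."""
    phone_rows = []
    for row in rows:
        for phone in row["phones"].split(","):
            phone_rows.append(
                {"customer_id": row["customer_id"], "phone": phone.strip()}
            )
    return phone_rows


customer_phones = to_first_normal_form(customers)
print(customer_phones)
```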


2006.08.16
Even if dimensional modeling is highly denormalized, it is still important to understand the 3 rules of Data Normalization. Normalization helps eliminate the problems of redundancy in database design.

See Yackity Blog Blog Post: First 3 Rules of Data Normalization for Newbies - Introduction (http://press.teleinteractive.net/yackity/2004/12/18/first_3_rules_of_data_normalization_for_1)


2006.08.15
The three basic constructs of Entity Relationship Diagrams (ERD) are:

  • Entity: A significant thing (place, person, concept, event) about which information needs to be known or held
  • Attribute: A characteristic that serves to identify, quantify, classify or qualify an entity
  • Relationship: The way in which two entities are related

See Yackity Blog Blog Post: ERD - Basic Constructs (http://press.teleinteractive.net/yackity/2005/06/19/erd_basic_constructs)


2006.08.14
A CRUD (Create, Read, Update, Delete) matrix is one technique for mapping the data model to the process model. The CRUD matrix helps identify:

  • Missing Process: A business process exists that is not supported by the model; the data model needs to be adjusted to support the missing process
  • Redundant Data Entity: A data entity exists that is not required to support any process

See Yackity Blog Blog Post: Technique to Map Business Process to Data Model (http://press.teleinteractive.net/yackity/2005/06/16/technique_to_map_business_process_to_dat)
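
A minimal sketch of how a CRUD matrix can surface both findings, assuming the matrix is held as a mapping from process to the entities it touches. The process names, entity names and helper functions are hypothetical and chosen only for illustration.

```python
# Hypothetical CRUD matrix: process -> {entity: CRUD operations performed}.
crud_matrix = {
    "Take Order": {"Customer": "R", "Order": "C"},
    "Ship Order": {"Order": "RU", "Shipment": "C"},
    "Issue Refund": {},  # no entity in the model supports this process yet
}

# Hypothetical list of every entity in the data model.
entities = ["Customer", "Order", "Shipment", "Fax Log"]


def missing_processes(matrix):
    """Processes with an empty row: the data model does not support them."""
    return [process for process, cells in matrix.items() if not cells]


def redundant_entities(matrix, all_entities):
    """Entities with an empty column: no process creates, reads, updates or deletes them."""
    used = {entity for cells in matrix.values() for entity in cells}
    return [entity for entity in all_entities if entity not in used]


print(missing_processes(crud_matrix))
print(redundant_entities(crud_matrix, entities))
```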


2006.08.13
During requirements gathering, interview the users. Identify performance metrics that they want to analyze.


2006.08.12
Data silos usually occur when different departments create their own data marts, e.g. marketing has a marketing data mart; finance has a finance data mart, etc. Focus on the business processes and avoid data duplication.


2006.08.11
It is important to know how your BI implementation is being used. Measure the benefits that the BI system brings to the enterprise, how it is used and areas where it can be improved.


2006.08.10
If consultants are used to help in the development and implementation of a BI project, ensure that proper knowledge transfer is done.


2006.08.09
Train users to use the system. A BI implementation is not successful if users can't use it.


2006.08.08
Users will not trust a BI implementation with dirty, unusable data. Engage users and stakeholders in the data cleansing efforts.


2006.08.07
Establish a Service Level Agreement (SLA) for the BI implementation. Areas for the SLA include, but are not limited to: uptime, business continuity, adhoc query response time, and data quality.


2006.08.06
A BI project is not just an IT project or a Business User project. Partnership between business and technology will ensure that all parts integral to the project work together.


2006.08.05
A BI implementation that has to be redefined and reformulated due to incorrect assumptions and architectural errors is very costly. Ensure that a BI strategy and roadmap have been created that address current needs and take future enterprise needs into consideration.


2006.08.04
Part of proper BI project scoping is proper definition and estimation of activities. Overall effort must be estimated in a systematic, methodical manner using project management techniques.


2006.08.03
Setting the expectations of stakeholders is important in any project. A lack of understanding of the complexity of one’s BI implementation often leads the project astray.


2006.08.02
For BI projects to deliver enterprise business value, challenges involving both the business and technical aspects must be addressed.


2006.08.01
Communication is very important in a project. When possible, create a portal or a collaboration site where your BI stakeholders can go to exchange information. Information that can be provided includes (but is not limited to): Frequently Asked Questions (FAQs), Key Performance Indicator (KPI) and Service Level Agreement (SLA) reports governing utilization, Training Materials, etc.

July 2006


2006.07.31
When creating your BI Operations and Maintenance strategy, make sure that it covers the policies to review, monitor, manage, and improve your current and future architecture and infrastructure.


2006.07.30
In addition to adequate planning and resource management to ensure that the BI system meets the needs of the enterprise, ongoing assessment of the system and its performance must be undertaken. These help ensure that your BI system is realizing its benefits.


2006.07.29
When planning for the deployment of your BI system, remember to define the service level agreement (SLA) with your users.


2006.07.28
An article by www.advisor.com (http://www.advisor.com/doc/11874) states that the adoption of Business Analytics is challenged by understanding and managing user expectations as well as by budget constraints.


2006.07.27
When choosing a hardware platform for your BI implementation, plan for growth. Perform database sizing to determine your server performance requirements. It is important to select a scalable platform.


2006.07.26
Prioritize your requirements and create a plan that delivers a phased, iterative, BI system implementation. The approach encourages early wins that build support throughout the organization.


2006.07.25
During data analysis, have a clear understanding of data relationships among all your source systems.


2006.07.24
It is good practice to keep track of data that have changed in your BI system. One way to do this is to create an audit database table that keeps track of when the data has been modified, type of modification, etc. This table can be populated by the use of database triggers.
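
A minimal sketch of an audit table populated by a trigger, using Python's built-in sqlite3 module with an in-memory database. The customer and customer_audit tables, their columns and the trigger name are hypothetical; trigger syntax differs between database products, so treat this as an illustration of the idea only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, region TEXT);

    -- Hypothetical audit table: what changed, how, and when.
    CREATE TABLE customer_audit (
        audit_id    INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id INTEGER,
        change_type TEXT,
        old_region  TEXT,
        new_region  TEXT,
        changed_at  TEXT DEFAULT CURRENT_TIMESTAMP
    );

    -- The trigger writes one audit row for every updated customer row.
    CREATE TRIGGER customer_after_update AFTER UPDATE ON customer
    BEGIN
        INSERT INTO customer_audit (customer_id, change_type, old_region, new_region)
        VALUES (OLD.customer_id, 'UPDATE', OLD.region, NEW.region);
    END;
    """
)

conn.execute("INSERT INTO customer VALUES (1, 'Northeast')")
conn.execute("UPDATE customer SET region = 'Southwest' WHERE customer_id = 1")

audit_rows = conn.execute(
    "SELECT customer_id, change_type, old_region, new_region FROM customer_audit"
).fetchall()
print(audit_rows)
```

Similar triggers for INSERT and DELETE would complete the picture; they are omitted here to keep the sketch short.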


2006.07.23
In choosing BI tools, it is important to understand the relationship between the functions of the analytic tools and the requirements and preferences of the user community and your BI architecture.


2006.07.22
Well-planned BI system maintenance and operations prevent chaos when a production problem is encountered. Test and document step-by-step operational procedures, e.g. restoring a database table or an entire database. Establish escalation procedures and an up-to-date contact list.


2006.07.21
Shortcomings in the data almost always cause users to mistrust or not use the implemented business intelligence system. Determine the right data by focusing on the business needs and requirements. Create a BI roadmap and follow it.


2006.07.20
It is critical to the BI framework and implementation to understand which of the enterprise's core business processes generate or collect metrics. These performance metrics support the users’ analytic needs. E.g., the ordering business processes support numerous reports and analyses, such as sales analysis and sales person performance, to name a few.


2006.07.19
Design and optimize your data warehouse. It is important to remember that in a data warehouse, query speed is very important. Most queries tend to be adhoc with users (not DBAs) creating and running reports.


2006.07.18
Find efficient ways to do your ETL. Copying and re-copying the contents of the entire physical database, again and again, even if no data has changed since the last refresh, is resource- and time-consuming. Determine with your users how often the data warehouse is refreshed (i.e. daily, hourly, etc.). Then, implement a strategy in which only the “net changes” to the data are extracted, transformed and loaded into the data warehouse.
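
A minimal sketch of a "net changes" refresh, assuming each source row carries an updated_at timestamp that can be compared with the time of the last refresh. The row layout, the timestamp column and both helper functions are assumptions made for illustration; other change detection approaches (for example, reading the database log) exist and may suit your sources better.

```python
from datetime import datetime

# Hypothetical source rows; updated_at is an assumed change-tracking column.
source_rows = [
    {"id": 1, "amount": 100, "updated_at": datetime(2006, 7, 17, 9, 0)},
    {"id": 2, "amount": 250, "updated_at": datetime(2006, 7, 18, 8, 30)},
    {"id": 3, "amount": 75, "updated_at": datetime(2006, 7, 18, 10, 15)},
]


def extract_net_changes(rows, last_refresh):
    """Extract only the rows modified since the previous refresh."""
    return [row for row in rows if row["updated_at"] > last_refresh]


def load(warehouse, changed_rows):
    """Insert new rows and overwrite changed ones, keyed by id."""
    for row in changed_rows:
        warehouse[row["id"]] = row["amount"]
    return warehouse


# State of the (hypothetical) warehouse after the previous refresh.
warehouse = {1: 100, 2: 200}
last_refresh = datetime(2006, 7, 18, 0, 0)

changes = extract_net_changes(source_rows, last_refresh)
load(warehouse, changes)
print([row["id"] for row in changes])
print(warehouse)
```

Note that a timestamp comparison like this does not detect rows deleted from the source; that case needs separate handling.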


2006.07.17
Data aggregation helps improve the performance of some business intelligence tools. Storing aggregates in place of detail also reduces the space needed in the database. However, too much aggregation may prevent users from looking at all the details they need. Determine your need for summary and detailed data. Balance your data aggregation implementation so the data warehouse stores both detailed and aggregated data without sacrificing performance.
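
A minimal sketch of keeping detailed and aggregated data side by side, using Python's built-in sqlite3 module with an in-memory database. The sales_detail and sales_monthly tables and their columns are hypothetical. Summary queries can read the small monthly table while the detail table remains available for drill down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical detail table: one row per sale.
    CREATE TABLE sales_detail (sale_date TEXT, region TEXT, amount REAL);
    INSERT INTO sales_detail VALUES
        ('2006-07-01', 'Northeast', 100.0),
        ('2006-07-02', 'Northeast', 50.0),
        ('2006-07-02', 'West', 75.0),
        ('2006-08-01', 'West', 25.0);

    -- Hypothetical aggregate table: one row per month and region.
    CREATE TABLE sales_monthly AS
        SELECT substr(sale_date, 1, 7) AS sale_month,
               region,
               SUM(amount) AS total_amount,
               COUNT(*) AS sale_count
        FROM sales_detail
        GROUP BY substr(sale_date, 1, 7), region;
    """
)

monthly = conn.execute(
    "SELECT sale_month, region, total_amount FROM sales_monthly "
    "ORDER BY sale_month, region"
).fetchall()
detail_count = conn.execute("SELECT COUNT(*) FROM sales_detail").fetchone()[0]
print(monthly)
print(detail_count)
```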


2006.07.16
Establish data management procedures early in your BI implementation. When establishing data management best practices, keep in mind that data management involves standardizing and reusing data, preventing data silos and ensuring that the enterprise data in use stays consistent across departments and applications.


2006.07.15
Effective, rigorous and sustained data management also includes data stewardship and data governance.


2006.07.14
"Under a stewardship program, everybody shares responsibility and accountability for data excellence; information belongs to everybody, and like other (more empirical) corporate resources, it must be scrupulously managed and cared for."
Source: The Case for Data Stewardship, Article published in DM Review Magazine, February 2005 Issue (http://www.dmreview.com/article_sub.cfm?articleId=1018108)


2006.07.13
Data stewardship involves managing the information needed to ensure that data captured and reported are accurate and accessible for timely decision making and activity monitoring.


2006.07.12
Data quality has a big impact on the success of the BI initiative. Institute a Data Stewardship program that draws on the expertise of the business and IT.


2006.07.11
Data Quality has a strong impact on the usefulness and effectiveness of BI implementations.


2006.07.10
Aside from the right architecture and strategy, another key to a successful BI project is to gain top management commitment and support.


2006.07.09
To ensure successful administrative closure of your BI project, plan and document the procedures by which the deliverables will be accepted and the project closed.


2006.07.08
When monitoring and controlling your BI project, analyze the performance of your project. Sample techniques include Variance Analysis and Trend Analysis, to name a few.


2006.07.07
When defining how to manage the quality of your BI project, it is important to define (who, what, how, why, when) quality metrics.


2006.07.06
Just like any project, make sure your BI project has a project management plan that includes how scope, time, cost, quality, human resources, communications, risks and procurement will be managed.


2006.07.05
Create a Communications Plan for your BI project that includes how to manage stakeholders, i.e., ways to communicate with them, how to identify their requirements and how to resolve their issues.


2006.07.04
Plan how to manage the scope of your BI project including the process to verify deliverables and the process to control scope changes.


2006.07.03
As in any project, it is important to determine which project risks might affect the project and to develop a project risk response, i.e. options and actions, for the identified risks.


2006.07.02
TDWI Research by Wayne Eckerson entitled In Search of a Single Version of Truth (http://www.tdwi.org/research/display.aspx?ID=7173) shows that “on average organizations have 2.1 data warehouses, 6 independent data marts, 4.5 operational data stores and 28.5 spreadmarts.”


2006.07.01
The effectiveness of BI solutions in the enterprise is undermined if the organization has multiple data analytic silos with redundant data and BI tools.

June 2006


2006.06.30
When selecting BI tools, organizations must create evaluation criteria that cover more than just technical areas.


2006.06.29
BI is not just a reporting tool.


2006.06.28
Strong executive support is essential in a successful BI implementation.


2006.06.27
Sustain the initial success of a BI implementation by:

  • maintaining continuous buy-in from stakeholders
  • proactive support and maintenance
  • evangelization of how BI implementation can be leveraged to support business goals and strategies
  • regularly delivering new BI functions and iterations


2006.06.26
BI query response time is important. Since most users tend to be impatient and want instantaneous results, BI database administrators need to tune the database performance and be proactive about response times.


2006.06.25
Before an organization purchases a BI tool or decides to use Open Source BI tools, it is important to understand the current architecture of the BI environment or architect the environment if no architecture exists.


2006.06.24
In delivering BI reports, it is important to focus on how well users can digest and easily use their reports.


2006.06.23
It is important for BI professionals not to get so caught up in the details of delivering the technical and data architecture that the underlying business needs to be satisfied are forgotten.


2006.06.22
Reliable report query performance garners user acceptance.


2006.06.21
Effective data analytic applications make it easy for a range of users to access and use their information.


2006.06.20
BI tools that have scorecard capability provide a way of measuring performance and delivering key metrics to the appropriate users. Effective scorecard and interactive dashboard applications provide a summary set of metrics. When a metric indicates a problem, the application should give the user the ability to drill down.


2006.06.19
Dashboards or scorecards for BI applications provide a graphical display that compares performance against predefined goals. They can provide a visual form of exception reporting.


2006.06.18
For the BI professional, data integration skills are essential in roles such as data steward, data acquisition architect, source data analyst, and ETL developer. They are also valuable in other roles such as business intelligence architect, metadata administrator, and quality administrator.


2006.06.17
Data Integration is fundamental to business intelligence. It includes all of the activities necessary to acquire data from sources, and to transform and cleanse the data. It is an important process in delivering a rich and robust data resource for the enterprise.


2006.06.16
In data integration projects, ETL is used not only to solve transformation complexity caused by the need to restructure and aggregate data. It can also be used to cleanse and reconcile data to improve data quality.


2006.06.15
Common uses of data integration include:

  • historical and operational data reporting
  • adhoc reporting
  • performance management and
  • operational analysis


2006.06.14
A data integration strategy and infrastructure must take into account the needs of the organization as well as the application, business process, and user interaction strategies of the organization.


2006.06.13
It is recommended that organizations have a long term objective to create a flexible data integration architecture that evolves over time. An enterprise architecture is especially important if the organization has a complex heterogeneous data environment involving large volumes of data.


2006.06.12
Integrating disparate data is a difficult task. Common issues encountered are:

  • data quality
  • security
  • lack of a business case, which results in inadequate funding
  • poor data integration infrastructure


2006.06.11
A DW success survey published by TDWI’s Business Intelligence Journal, Fall of 2005 (http://www.tdwi.org) shows that:

  • better quality information
  • improved productivity and
  • speedy retrieval

are the top three important critical success factors for data warehouse implementations.



2006.06.10
Poor data quality in the BI solution makes it useless because it does not give the users an accurate “single version of truth”.


2006.06.09
One of the critical success factors for building BI solutions is ensuring that the solution is developed with improving user productivity in mind.


2006.06.08
Establish communication channels with the users and future users of the BI project. If time is invested in establishing lines of communications early in the project, there is a higher probability that users are happy with the end result.


2006.06.07
Quality metrics define how quality will be measured. Develop quality metrics for the BI project as well as a quality checklist to ensure that all requirements delivered will pass the quality control for the project.


2006.06.06
Stakeholders of the BI project have expectations of what the BI solution will deliver. These expectations and needs must be analyzed, understood, and translated into measurable objectives and documented requirements. If this is not done, the project has a high risk of failing user acceptance.


2006.06.05
As with any project, clearly define the scope of your BI project. This enables a detailed understanding of the requirements that are going to be executed, verified and delivered.


2006.06.04
Sustain and build long term support for your BI solution. Marketing and selling the BI vision does not stop with getting the BI project approved.


2006.06.03
A proof-of-concept project is helpful in validating the suitability of a particular Open Source BI technology platform for the organization.


2006.06.02
To ensure successful roll-out of the BI solution and its effective use, adequate staff should be assigned to support ongoing operations including performance monitoring/tuning, production operations, maintenance and enhancements, etc.


2006.06.01
End user tools are important to how successful the data warehouse/business intelligence solution is. Tools enable users to access their data and get the "information" they need.


May 2006


2006.05.31
Get user input to indicate the historical data requirements of the data warehouse. There is no one formula to determine how much historical data should be kept in the data warehouse. It depends on the needs of the business and varies from business to business.


2006.05.30
When implementing a BI solution, it is important to document the different source systems and any existing cross-system data flows and metadata exchange between them.


2006.05.29
BI projects are not standalone projects for the enterprise. Moreover, in some large organizations, there may be multiple data warehouse/data mart teams. Create an opportunity or forum for cross-functional/cross-team communication. This will facilitate collaboration and exchange of information.



2006.05.28
Implementation does not only involve the completion of the development of all the components of the BI systems and passing the user acceptance testing. Users and IT support group(s) must also be trained.


2006.05.27
During the Construction and Development phases, ensure that unit and integration testing are done. E.g., during ETL processing, the data usually undergoes a complex set of transformations. Ensure that as data is loaded into the data warehouse, all required business rules for the data warehouse are satisfied.


2006.05.26
Scoping is one of the challenging tasks of a BI project. Prioritize your requirements and deliver them in increments.


2006.05.25
Don't let your BI system be tagged as the system that was built, not designed.


2006.05.24
Data mining incorporates the use of mathematics and statistics to analyze business-related data.


2006.05.23
A BI project is a cross-organizational decision support solution that should be engineered to support both technical and non-technical infrastructure.


2006.05.22
Create a BI roadmap that enables parallel development tracks in which multiple iterative steps can be performed. Typical parallel development tracks include:

  • Backend/Metadata Repository track: Modeling and creation of target database and the metadata
  • ETL track: Design, development and population of target database from source systems
  • Frontend/Application track: Delivery of reports - OLAP, Adhoc, etc


2006.05.21
Established access controls for your BI systems provide the ability to manage users or groups of users.


2006.05.20
Evaluate and monitor ETL performance to determine response time, "bad" SQL statements, resulting network load, etc.


2006.05.19
Plan and design the ETL recovery processes for efficient handling of problems that occur during the ETL process.


2006.05.18
Data mappings done for the BI system are inputs to the data model, design and code of any simple or complex ETL transformations.


2006.05.17
Staging tables during the ETL process could be used to analyze and understand source data.


2006.05.16
BI systems help organizations integrate, share and use their disparate data.


2006.05.15
During the ETL process, disparate data need to be transformed to resolve any existing disparity and prevent any future ones.


2006.05.14
The process of building long term support for BI implementations is aimed at marketing and selling the benefits for continued investment in the BI system.


2006.05.13
It will be difficult to justify the effort that is needed to make a business intelligence system successful without an explicit connection to the strategic objectives of the organization.


2006.05.12
Experience has shown that justifications for a BI implementation that support revenue enhancing strategies generate longer term and stronger support from upper management.


2006.05.11
Explore, evaluate and decide upon the scope of the business intelligence system, e.g. what business functions will be supported, what facts are needed for the functions, what data sources will be used to create the facts, who the users will be, and what business units will be served.


2006.05.10
Business intelligence systems help companies use their data to make informed decisions. Empower the business users to create or modify their own BI reports.


2006.05.09
A rigorous methodology to implement BI in the enterprise that aligns people, processes and technology helps decrease costs while reaping the benefits of BI for an increased ROI.


2006.05.08
Whether an organization is implementing Open Source Business Intelligence or using commercial products, having a BI Strategy and Roadmap provides a foundation that accelerates and helps ensure the success of the BI initiatives.


2006.05.07
A press release by UK’s National Computing Centre (http://www.ncc.co.uk/aboutncc/aboutncc.cfm) says that Over two thirds of senior IT Professionals expect businesses to develop Open Source strategies (http://www.ncc.co.uk/aboutncc/press_rel/open_source_survey_results.cfm)



2006.05.06
According to Gartner's Positions on the Five Hottest IT Topics and Trends in 2005 (http://www.gartner.com), by 2008 Open Source Software applications will directly compete with closed-source products in every software infrastructure market.


2006.05.05
A Computer Economics survey (http://www.computereconomics.com/article.cfm?id=1043) shows that, according to its respondents, reduced dependence on software vendors is the number 1 advantage of Open Source Software. Lower total cost of ownership is only second.


2006.05.04
When evaluating Open Source software for your enterprise, it is important to thoroughly understand the type of license and its implications.


2006.05.03
Some areas many companies consider when evaluating Open Source are:

  • Code maturity and testing
  • Open source project timeline and evolution
  • Open source project milestones
  • Open source community
  • Support Availability and Cost
  • Tools availability for deployment and maintenance
  • Software/hardware industry acceptance/inclusion
  • System integrator support/availability


2006.05.02
Open source software evolves because of its community.


2006.05.01
OSI (http://www.opensource.org/) promotes the Open Source Definition (http://www.opensource.org/docs/definition.php).


April 2006


2006.04.30
Establish Training and Deployment Plans to ensure smooth implementation.


2006.04.29
Make testing and QA an integral part of the BI project development.


2006.04.28
During design review, some areas that should be covered are:

  • BI features and functions, and reports to be delivered, including those that have been changed or postponed for the next iteration
  • Data Model
  • Meta Data
  • Data Mart/Data Warehouse initial load model
  • details of each source-target flow including related transformation issues
  • front-end user experience
  • system and infrastructure review
  • Security architecture and model
  • End-user environment


2006.04.27
ETL is usually the most complex part of the BI project. Plan for and design all cross-platform flows of the data and the mechanisms by which they will occur. Some of the things to take into consideration include:

  • Frequency
  • Data Volume
  • ETL Recovery Process
  • Change Management, i.e., the impact of changes in any data structure of the source systems


2006.04.26
When doing Source Data Analysis, some of the things to determine and analyze are:

  • Scope, content and structure of each data source
  • data quality
  • data integrity problems
  • transformations needed for source-target mappings


2006.04.25
Creating a Function-Fact matrix is one scoping technique that can facilitate the identification and recording of BI requirements.


2006.04.24
Bringing together the right technical skills and business knowledge in the BI implementation team has a big impact on the implementation speed and success of the project.


2006.04.23
Organizations are constantly changing and so are the data resources that support them. It is a good strategy to create and manage metadata versions.


2006.04.22
It is a good practice to create a glossary of terms listing definitions and abbreviations that will be used in the BI implementation.


2006.04.21
Creating a common data architecture helps provide standards for defining common metadata.


2006.04.20
Establishing common metadata poses many challenges for organizations, since it can be difficult for people to agree on a common meaning. Metadata helps ensure that BI projects provide reliable information. Metadata tells BI users where data came from and what it means.


2006.04.19
Business experts, data experts and domain/source system experts must be part of the team that defines common metadata for the data warehouse/BI implementation.


2006.04.18
Data describing metadata is also known as meta metadata.


2006.04.17
Metadata in its simplest sense is data about the data.


2006.04.16
Bitmap indexing improves database access to records that can be selected through low cardinality attributes. An attribute has low cardinality when it has a small number of distinct values; bitmap indexes work best when those values are also rarely updated. E.g., gender is a good candidate for a bitmap index.
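
A minimal sketch of the idea behind a bitmap index, assuming Python integers are used as bit strings with one bit per row position. The sample rows and helper functions are hypothetical, and real database engines store and compress their bitmaps quite differently. The point shown is that a condition on two low cardinality attributes reduces to a bitwise AND of two bitmaps.

```python
# Hypothetical rows; the list position of a row is its bit position.
rows = [
    {"id": 0, "gender": "F", "region": "East"},
    {"id": 1, "gender": "M", "region": "West"},
    {"id": 2, "gender": "F", "region": "West"},
    {"id": 3, "gender": "M", "region": "East"},
]


def build_bitmap_index(table, column):
    """Map each distinct value to an integer whose set bits mark matching rows."""
    index = {}
    for position, row in enumerate(table):
        index[row[column]] = index.get(row[column], 0) | (1 << position)
    return index


def matching_positions(bitmap):
    """Return the row positions whose bit is set in the bitmap."""
    return [pos for pos in range(bitmap.bit_length()) if (bitmap >> pos) & 1]


gender_index = build_bitmap_index(rows, "gender")
region_index = build_bitmap_index(rows, "region")

# gender = 'F' AND region = 'West' becomes a single bitwise AND.
female_west = gender_index["F"] & region_index["West"]
print(matching_positions(female_west))
```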



2006.04.15
The execution path of a star query can be complex and data intensive especially when all the primary keys and foreign keys used in the join are indexed. It is a good practice to fully take advantage of the optimizer of the database you are using.


2006.04.14
Data Manipulation Language (DML) is a set of SQL commands that allow the creation, selection, deletion and modification of data values stored in the database, e.g. INSERT, SELECT, UPDATE, DELETE.
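
A minimal sketch of the four DML commands, run through Python's built-in sqlite3 module against an in-memory database. The product table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Set up a hypothetical table to work on (this statement is DDL, not DML).
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL)"
)

# INSERT creates data values.
conn.execute("INSERT INTO product (product_id, name, price) VALUES (1, 'Widget', 9.5)")
conn.execute("INSERT INTO product (product_id, name, price) VALUES (2, 'Gadget', 20.0)")

# UPDATE modifies existing data values.
conn.execute("UPDATE product SET price = 10.0 WHERE product_id = 1")

# DELETE removes data values.
conn.execute("DELETE FROM product WHERE product_id = 2")

# SELECT reads data values.
remaining = conn.execute("SELECT product_id, name, price FROM product").fetchall()
print(remaining)
```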


2006.04.13
Data Definition Language (DDL) is a set of SQL commands for the creation, modification or deletion of database objects, e.g. CREATE, ALTER, DROP.
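
A minimal sketch of DDL commands that create, modify and delete a database object, run through Python's built-in sqlite3 module against an in-memory database. The region table and the two helper functions are hypothetical; the helpers read SQLite's own catalog, which other database products expose differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names(connection):
    """List the tables currently defined, using SQLite's catalog table."""
    cursor = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in cursor]


def column_names(connection, table):
    """List the columns of a table, using SQLite's table_info pragma."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


# CREATE defines a new database object.
conn.execute("CREATE TABLE region (region_id INTEGER PRIMARY KEY, name TEXT)")

# ALTER modifies the definition of an existing object.
conn.execute("ALTER TABLE region ADD COLUMN manager TEXT")
columns_after_alter = column_names(conn, "region")

# DROP deletes the object.
conn.execute("DROP TABLE region")
tables_after_drop = table_names(conn)

print(columns_after_alter)
print(tables_after_drop)
```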


2006.04.12
Foreign keys implement an integrity constraint. The concept of a foreign key requires that every value used in the column(s) match a single value in the referenced table’s primary key column(s).
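
A minimal sketch of a foreign key rejecting a value that has no match in the referenced table, using Python's built-in sqlite3 module with an in-memory database. The region and customer tables are hypothetical. Note one SQLite-specific assumption: foreign key enforcement is off by default there and is switched on with a pragma, which other database products do not need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE region (region_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"
    " region_id INTEGER NOT NULL REFERENCES region (region_id))"
)

conn.execute("INSERT INTO region VALUES (1, 'Northeast')")
conn.execute("INSERT INTO customer VALUES (100, 1)")  # region 1 exists

try:
    conn.execute("INSERT INTO customer VALUES (101, 99)")  # region 99 does not exist
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(orphan_rejected, customer_count)
```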



2006.04.11
Referential integrity refers to the integrity of the data structure that ensures that a parent data occurrence exists for each subordinate data occurrence. Referential integrity ensures consistent data.



2006.04.10
In relational model theory, the integrity of the entity is maintained through the use of primary keys and column constraints. The primary keys must not be null and must have unique values. Create a unique index on the column(s) that comprise the primary or unique key.
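
A minimal sketch of entity integrity enforced by a NOT NULL column constraint plus a unique index, using Python's built-in sqlite3 module with an in-memory database. The employee table, the index name and the try_insert helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER NOT NULL, email TEXT NOT NULL)")
# Unique index on the column that makes up the key.
conn.execute("CREATE UNIQUE INDEX employee_uid ON employee (emp_id)")

conn.execute("INSERT INTO employee VALUES (1, 'ana@example.com')")


def try_insert(values):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = try_insert((1, "ben@example.com"))  # key value already used
null_accepted = try_insert((None, "cy@example.com"))  # key value missing
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(duplicate_accepted, null_accepted, row_count)
```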


2006.04.06, 2006.04.07, 2006.04.08, 2006.04.09

Design Goal 1 on Aggregates:

"Aggregates must be stored in their own fact tables, separate from the base atomic data. In addition, each distinct aggregation level must occupy its own unique fact table."

Design Goal 2 on Aggregates:

"The dimension tables attached to the aggregate fact tables must, whenever possible, be shrunken versions of the dimension tables associated with the base fact table."

Design Goal 3 on Aggregates:

"The base atomic fact table and all of its related aggregate fact tables must be associated together as a "family of schemas" so that the aggregate navigator knows which tables are related to one another."

Design Goal 4 on Aggregates:

"Force all SQL created by any end user data access tool or application to refer exclusively to the base fact table and its associated full-size dimension tables."
Source: The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses (http://www.amazon.com/gp/product/0471153370/sr=8-3/qid=1141322102/ref=pd_bbs_3/104-3789305-4076712?%5Fencoding=UTF8)


2006.04.05

"As a general rule of thumb, it is reasonable to plan on a 100 percent storage overhead for aggregates as a target for the overall data warehouse."

Source: The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses (http://www.amazon.com/gp/product/0471153370/sr=8-3/qid=1141322102/ref=pd_bbs_3/104-3789305-4076712?%5Fencoding=UTF8)



2006.04.04
When deciding what to aggregate for your data warehouse, get a sense of the common business requests the data warehouse receives and discover the statistical distribution of your data.


2006.04.03
For a very large data warehouse, the use of aggregate or summary data has a significant effect on improving query performance.


2006.04.02
After doing a prototype, conduct a lessons learned review on usability, design issues, data challenges, and other issues encountered.


2006.04.01
When developing BI prototypes, don't forget performance. Understanding and confirming a design used for prototypes also means designing for performance and scalability.


March 2006


2006.03.31
Create prototypes that are not throwaway projects at the end. Treat prototypes as BI pilot projects, a starting point and input to the iterative development process.


2006.03.30
When doing a BI prototype, start with a manageable and well defined scope. Use data that is meaningful to the users.


2006.03.29
BI prototypes are useful for doing a BI proof of concept and tools/products evaluation. Regardless of the objective of the prototype, use the prototype development exercise to understand source system data as well as the infrastructure and support requirements.


2006.03.28
Developing a prototype is a good way to confirm that the design and technologies used will meet the enterprise BI needs.


2006.03.27
When creating a BI architecture, identify the capabilities that are most important to the information needs of the enterprise.


2006.03.26
When the data scope for the BI implementation is unclear and presents a project risk, facilitate work sessions with business user sponsors and data owners to agree on data subject scope and priorities.


2006.03.25
When creating a BI strategy, partner with the business users. A clear understanding of business information needs and priorities, coupled with an assessment of the current systems, is essential.


2006.03.24
When implementing a BI solution, it is a good practice to proactively and formally address operational quality issues.


2006.03.23
Upfront planning and strategy creation provide a good foundation for a successful data warehouse/business intelligence implementation.


2006.03.22
Online Transaction Processing (OLTP) systems are used to meet the day-to-day operational data needs of the enterprise. Online Analytical Processing (OLAP) systems are used to meet the analysis and decision support data retrieval needs of the enterprise.


2006.03.21
The goals of decision support systems differ from the goals of transactional systems. When designing the database for a data warehouse and decision support, remember to optimize for query processing speed and loading speed.


2006.03.20
When designing the database for the data warehouse, create a schema that models the business needs.


2006.03.19
Integrated data from different sources cannot be used effectively unless it is modeled and organized based on the perspective and needs of the business.


2006.03.18
A comprehensive data definition provides a way for organizations to thoroughly understand the data in their BI implementation.


2006.03.17
Having a common understanding of the content and meaning of data is necessary in implementing an effective BI solution. E.g. one business unit may compute sales commission differently from another business unit within the same organization. Hence, it is good practice to work with the business users to identify and define the terms that are used.


2006.03.16
It is good practice to identify metrics to measure and report data quality of one's BI implementation.


2006.03.15
Source data often contains derived data. Understanding and documenting how the data is derived is essential in ensuring high quality data.


2006.03.14
During the analysis of source systems, it is good practice to identify and document the integrity, accuracy and completeness of the source data. E.g. one can document the percentage of records that violate unique constraints.
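
As a rough illustration of such a measurement, the sketch below computes the percentage of records whose supposedly unique key value occurs more than once. The function name percent_duplicate_keys, the customer_id column and the sample records are hypothetical.

```python
from collections import Counter


def percent_duplicate_keys(records, key):
    """Return the percentage of records whose key value occurs more than once."""
    if not records:
        return 0.0
    counts = Counter(record[key] for record in records)
    # Every record that shares its key with another record counts as a violation.
    violating = sum(n for n in counts.values() if n > 1)
    return 100.0 * violating / len(records)


# Hypothetical source extract in which customer_id is supposed to be unique.
customers = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 2, "name": "Bob"},
    {"customer_id": 2, "name": "Robert"},
    {"customer_id": 3, "name": "Cy"},
]

print(percent_duplicate_keys(customers, "customer_id"))  # prints 50.0
```

This version counts every record involved in a collision. Counting only the surplus copies is an equally reasonable definition, so document whichever one is used.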


2006.03.13
Data Quality involves data integrity, data accuracy, data completeness, data consistency, and timeliness.


2006.03.12
Data integrity deals with the integrity of data value, data structure, data retention and data derivation.


2006.03.11
Conformed Dimensions are dimensions that are made identical across data sources. E.g. if two data sources share a dimension such as time, conforming it means expressing it at the lowest level of granularity that is common to both data sources. Hence, if one data source is based on weekly shipping transactions and the other on monthly shipping transactions, the best way to conform them is to base both sources on daily shipping transactions.
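
One reason a daily grain works in this example is that weeks and months do not nest inside each other, but both can be rolled up from days. A minimal sketch, assuming a hypothetical helper calendar_attributes that derives the week and month labels for one row of a shared daily date dimension:

```python
from datetime import date


def calendar_attributes(day):
    """Build one row of a shared daily date dimension (hypothetical layout)."""
    iso_year, iso_week, _ = day.isocalendar()
    return {
        "date": day.isoformat(),
        "iso_week": f"{iso_year}-W{iso_week:02d}",  # roll-up label for weekly reports
        "month": day.strftime("%Y-%m"),             # roll-up label for monthly reports
    }


print(calendar_attributes(date(2006, 3, 11)))
```

Once both sources carry the same daily date key, either one can be summarized by week or by month from the same dimension rows.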


2006.03.10
A dimension attribute is discrete if it takes on a fixed set of values or if it is textual, e.g. the description of a product.


2006.03.09
A Dimension is a descriptive way to analyze the business. A good dimension is discrete and textual, and usually serves as a header in a user's report.


2006.03.08
When a fact table has no measured facts and holds only the keys to its dimensions, it is called a Factless Fact table.
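
A small illustration using Python's built-in sqlite3 module. The attendance_fact table, its key columns and the sample rows are hypothetical. Each row records only that an event happened, so the analysis consists of counting rows.

```python
import sqlite3

# Hypothetical factless fact table: dimension keys only, no measure columns.
factless_db = sqlite3.connect(":memory:")
factless_db.executescript("""
    CREATE TABLE attendance_fact (
        date_key    TEXT,
        student_key INTEGER,
        course_key  INTEGER
    );
    INSERT INTO attendance_fact VALUES
        ('2006-03-06', 1, 101),
        ('2006-03-06', 2, 101),
        ('2006-03-07', 1, 102);
""")


def attendance_by_course():
    """Count attendance events per course; the row itself is the 'fact'."""
    rows = factless_db.execute(
        "SELECT course_key, COUNT(*) FROM attendance_fact GROUP BY course_key"
    ).fetchall()
    return dict(rows)


print(attendance_by_course())
```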


2006.03.07
If a fact cannot be added along any of its dimensions, it is a Nonadditive Fact. A numeric nonadditive fact is usually combined with other facts in a computation; a non-numeric one is usually used for grouping or constraints.
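
A ratio such as margin percentage is a common example of a numeric nonadditive fact. The sketch below uses hypothetical revenue and cost figures. It shows that adding the row-level percentages gives a meaningless number, while recomputing the ratio from the summed additive facts gives the correct overall margin.

```python
def margin_pct(revenue, cost):
    """Margin as a percentage of revenue."""
    return 100.0 * (revenue - cost) / revenue


def total_margin_pct(rows):
    """Sum the additive facts first, then compute the ratio once."""
    revenue = sum(row["revenue"] for row in rows)
    cost = sum(row["cost"] for row in rows)
    return margin_pct(revenue, cost)


# Hypothetical fact rows.
sales_rows = [
    {"revenue": 100.0, "cost": 50.0},   # 50 percent margin
    {"revenue": 300.0, "cost": 270.0},  # 10 percent margin
]

wrong = sum(margin_pct(r["revenue"], r["cost"]) for r in sales_rows)
right = total_margin_pct(sales_rows)
print(wrong, right)  # prints 60.0 20.0
```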


2006.03.06
If one can add a numeric fact along some dimensions but not along others, it is a Semiadditive Fact.
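
An account balance is a typical example: it can be added across accounts on one day, but adding it across days double counts the money. A minimal sketch with hypothetical account names and amounts:

```python
# Hypothetical daily balance snapshots: (date, account, balance).
balances = [
    ("2006-03-05", "acct-1", 100.0),
    ("2006-03-05", "acct-2", 40.0),
    ("2006-03-06", "acct-1", 120.0),
    ("2006-03-06", "acct-2", 40.0),
]


def total_on(day):
    """Adding across the account dimension is valid for a single day."""
    return sum(amount for d, _, amount in balances if d == day)


def period_end_balance(account):
    """Across time, take the latest snapshot instead of summing."""
    snapshots = [(d, amount) for d, acct, amount in balances if acct == account]
    return max(snapshots)[1]  # ISO date strings sort chronologically


print(total_on("2006-03-06"))        # prints 160.0
print(period_end_balance("acct-1"))  # prints 120.0
```

Taking the period-end value is only one option; an average over the period is another common way to summarize a semiadditive fact across time.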



2006.03.05
A Fact is a business measure. The characteristics of a good fact are: additive, continuously valued and numeric.


2006.03.04
In Dimensional Modeling, snowflaking generally slows down browsing performance. To snowflake is to normalize a flat dimension table, decomposing it into a tree structure of related tables.



2006.03.03

Steps in Dimensional Design Process:

  1. Choose a business process to model
  2. Choose the grain of the business process
  3. Choose the dimensions that will apply to each fact table record
  4. Choose the measured facts that will populate each fact table record.


Source: The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses (http://www.amazon.com/gp/product/0471153370/sr=8-3/qid=1141322102/ref=pd_bbs_3/104-3789305-4076712?%5Fencoding=UTF8)


2006.03.02
In Dimensional Modeling, the grain of the fact table is the lowest level of data to be represented, e.g. an individual sales transaction.


2006.03.01
A Dimension is a descriptive way to analyze the business. One way to think of a Dimension is: it is preceded by the word "BY", e.g. BY time, BY region.


A Dimension Table is one of a set of companion tables to a fact table. Each dimension is defined by its primary key that serves as the basis for referential integrity with any given fact table to which it is joined.

February 2006


2006.02.28
In Dimensional modeling, a Fact is a business measure. A Fact Table is a primary table in each dimensional model that is meant to contain measurements of the business.


2006.02.27
A Star Schema consists of a large, dominant central table that is joined to a number of smaller tables. The central table is a fact table and the surrounding tables are dimension tables.
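
A minimal sketch of such a layout, using Python's built-in sqlite3 module. The table names, columns and sample rows are hypothetical. One fact table holds the measure and the keys, two dimension tables hold the descriptions, and a report "BY region" is a join plus a GROUP BY.

```python
import sqlite3

# Hypothetical star: sales_fact in the middle, date_dim and region_dim around it.
star_db = sqlite3.connect(":memory:")
star_db.executescript("""
    CREATE TABLE date_dim   (date_key INTEGER PRIMARY KEY, month TEXT);
    CREATE TABLE region_dim (region_key INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sales_fact (
        date_key   INTEGER REFERENCES date_dim(date_key),
        region_key INTEGER REFERENCES region_dim(region_key),
        amount     REAL
    );
    INSERT INTO date_dim   VALUES (1, '2006-01'), (2, '2006-02');
    INSERT INTO region_dim VALUES (1, 'Northeast'), (2, 'West');
    INSERT INTO sales_fact VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 25.0);
""")


def sales_by_region():
    """Total the fact measure BY a dimension attribute."""
    query = """
        SELECT r.region, SUM(f.amount)
        FROM sales_fact AS f
        JOIN region_dim AS r ON r.region_key = f.region_key
        GROUP BY r.region
        ORDER BY r.region
    """
    return star_db.execute(query).fetchall()


print(sales_by_region())  # prints [('Northeast', 125.0), ('West', 50.0)]
```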





2006.02.26
Business Intelligence (BI) is a term with no single definition: some use it interchangeably with Decision Support Systems (DSS), and many have made up their own definitions based on this. We use the following: a Business Intelligence system is much like the "intel" of the spy business. It couples third party competitive and market information with internal data sources to provide input to the DSS.


2006.02.25
A Data Mart is a database that is optimized for storage and analysis rather than transactions, but is focused on solving a specific business need (usually functional or geographic). For example, an Integrated Customer Data Mart holds all customer information, whether it comes from sales, order fulfillment or customer support, and is used to analyze customer demographics, buying patterns, response to marketing campaigns and the like; a regional data mart may hold all accounting, manufacturing, customer and employee data for the Northeast offices of a company.