Renewables and Smart Grid

We are currently in at least the fourth era of growth and interest in renewable energy. The first two of which I'm aware, one in the late 1800s running into the turn of that century and one in the 1950s, both concentrated on solar (photovoltaics and solar thermal), with some wind power in the first. The third was during the Carter Administration in the 1970s (famously ending when Ronald Reagan ordered the solar panels off the roof of the White House). Disclosure: I was doing photovoltaic research at SES, Inc. (now part of Royal Dutch Shell) as a physical electrochemist during this time.

During the recent upswing in interest, investment and installations of renewable energy sources (photovoltaics, solar thermal, wind, wave, tidal, geothermal, biomass, etc.), I've been worried that the bubble would soon burst. But today, I had a thought that encourages me: maybe renewables will take their place alongside coal, oil and nuclear. The reason for this is complex, more social than technical, more due to business than to science.

Many point to the past failures of renewables, of whatever type, as due to inefficiencies and to long, or infinite, periods for a return on the upfront investment. But I think that much of what prevented adoption of renewables was social and business concerns. For the most part, the past marketing effort for renewables was to get people off the grid. This was scary for the individual, not justified by the ROI, and inimical to business interests.

Today, however, we have the prospect of the Smart Grid. What exactly defines the Smart Grid is still being debated, but here's my hopeful thought. Just as the Internet evolved to combine data, communication and collaboration protocols into what we now term Web 2.0, the read-write web, or social media, allowing anyone who desires to do so to become a producer of content as well as a consumer, the Smart Grid will not force users of renewable energy sources off the grid, but will allow whoever desires to do so to become a producer as well as a consumer of utility services, starting with electricity, but perhaps evolving to include other utility services as well. Let me also point out that I'm not [just] talking about the individual; I'm talking about communities and small businesses. For example, the Smart Grid would allow a small business such as our local Coastside Scavengers to install an AdaptiveARC reactor, transforming the waste they pick up from our homes into electricity, and additional cash flow.

This possibility has social, business and economic implications that the previous generations of renewables lacked. This gives me hope. This also strengthens my desire to see workable standards, and working implementations of the Smart Grid(s) - whatever that turns out to really mean.

Questions and Commonality

In the introduction to our open source solutions (OSS) for decision support systems (DSS) study guide (SG), I gave a variety of examples of activities that might be considered using a DSS. I asked some questions as to what common elements exist among these activities that might help us to define a modern platform for DSS, and whether or not we could build such a system using open source solutions.

In this post, let's examine the first of those questions and see if we can start answering it. In the next post, we will lay out a syllabus of sorts for this OSS DSS SG.

The first common element is that in all cases, we have an individual doing the activity, not a machine nor a committee.

Secondly, the individual has some resources at their disposal. Those resources include current and historical information, structured and unstructured data, communiqués and opinions, and some amount of personal experience, augmented by the experience of others.

Thirdly, though not explicit, there's the idea of digesting these resources and performing formal or informal analyses.

Fourthly, though again not explicit, the concept of trying to predict what might happen next, or as a result of the decision, is inherent to all of the examples.

Finally, there's collaboration involved. Few of us can make good decisions in a vacuum.

Of course, since the examples are fictional, and created by us, they represent our biases. If you had fingered our domain server back in 1993, or read our .project and .plan files from that time, you would have seen that we were interested in sharing information and analyses, while providing a framework for making decisions using such tools as email, gopher and electronic bulletin boards. So, if you identify any other commonalities, or think anything is missing, please join the discussion in the comments.

From these commonalities, can we begin to answer the first question we had asked: "What does this term [DSS] really mean?". Let's try.

A DSS is a set of processes and technology that help an individual to make a better decision than they could without the DSS.

That's nice and vague; generic enough to be almost meaningless, but it provides some key points that will help us to bound the specifics as we go along. For example, if a process or technology doesn't help us to make a better decision, then it doesn't fit. If something allows us to make a better decision, but we can't define the process or identify the technology involved, it doesn't belong either (e.g. "my gut tells me so").

Let's create a list from all of the above.

  1. Individual Decision Maker
  2. Process
  3. Technology
  4. Structured Data
  5. Unstructured Data
  6. Historical Information
  7. Current Information
  8. Communication
  9. Opinion
  10. Collaboration
  11. Analysis
  12. Prediction
  13. Personal Experience
  14. Others' Experience

What do you think? Does a modern system to support decisions need to cover all of these elements and no others? Is this list complete and sufficient? The comments are open.

First DSS Study Guide

Someone sitting in their study, looking at their books, journals, piles of scholarly periodicals and files of correspondence with learned colleagues probably didn't think that they were looking at their decision support system, but they were.

Someone sitting on the plains, looking at the conditions around them, smoke signals from distant tribe members, records knotted into a string, probably didn't think that they were looking at their decision support system, but they were.

Someone at the nexus of a modern military command, control, communications, computing and intelligence system, probably didn't think that they were looking at their decision support system, but they were.

Someone pulling data from transactional systems, and dumping the results of reports & analyses from a BI tool into a spreadsheet to feed a dashboard for the executives of a huge corporation, probably didn't think that they were looking at their decision support system, but they were.

The term "decision support system" has been in use for over 50 years, perhaps longer.

  • But what does this term really mean?
  • What do all of my examples have in common?
  • How can we build a reasonable decision support system from open source solutions?
  • What resources exist to help us learn?

I'm starting a series of posts, essentially a "study guide" to help answer these questions.

I'll be drawing from and pointing to the following books and online resources as we install, configure and use open source systems to create a technical platform for a decision support system.

  1. Bayesian Computation with R by Jim Albert, Springer UseR! Series, ISBN: 0-387-92297-0, Purchase from Amazon; you can also purchase the Kindle ebook from Amazon
  2. R in a Nutshell by Joseph Adler, ISBN: 0-596-80170-X, Purchase from Amazon
  3. Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL, by Roland Bouman and Jos van Dongen, ISBN: 0-470-48432-2, Purchase from Amazon
  4. Pentaho Reporting 3.5 for Java Developers by Will Gorman, ISBN: 1-847193-19-6, Purchase from Amazon
  5. Pentaho Kettle Solutions: Building Open Source ETL Solutions with Pentaho Data Integration by Matt Casters, Roland Bouman & Jos van Dongen, ISBN: 0-470-63517-7, due 2010 September, Pre-Order from Amazon
  6. Data Mining: Practical Machine Learning Tools and Techniques by Ian H. Witten and Eibe Frank, Second Edition, Morgan Kaufmann Series in Data Management Systems, ISBN: 0-12-088407-0, a.k.a. "The Weka Book", Purchase from Amazon; Pre-Order the Third Edition; you can also purchase the Kindle ebook from Amazon
  7. LucidDB online documentation
  8. Pertinent information from Eigenbase
  9. LucidDB mailing list archive on Nabble
  10. Anything I can find on PAT
  11. Pentaho Community Forums, Wiki, WebEx Events, and other community sources
  12. R Mailing Lists and Forums
  13. Various Books in PDF from The R Project
  14. Information Management and Open Source Solution Blogs from our side-column linkblogs

In this study guide series of posts:

  • I'll show how data warehousing (DW) and business intelligence (BI) can be extended to include all the elements held in common from my DSS examples.
  • We'll examine the open source solutions Pentaho, R, Rserve, rApache, LucidDB and possibly MapReduce & key-value stores, and the related open source projects, communities and companies, in terms of how they can be used to create a DSS.
  • I would like to add a collaboration tool to the mix, as we do in our implementation projects, possibly MindTouch, a RESTful wiki platform.
  • We may add one non-open source package, SQLStream, that's built upon open source elements from Eigenbase. This will allow us to add a real-time component to our DSS.
  • I'll give my own experience in installing these packages and getting them to work together, with pointers to the resources listed above.
  • We'll explore sample and public data sets with the DSS environment we created, again with pointers to and help from the resources listed.

This series of posts is meant to be a study guide, not an online book written as a blog. The goal is to help us to define a modern DSS and build it out of open source solutions, while using existing resources.

Please feel free to comment, especially if there is anything that you feel should be included beyond what I've outlined here.

Modeling and Predictives

Here's a personal perspective and a bit of a personal history regarding mathematical modeling and predictives.

The 1980s were an exciting time for mathematical modeling of complex systems. At the time, there were two basic types of modeling: deterministic and stochastic (probability or statistics models). Within stochastic modeling, traditional statistics vs. Bayesian statistics was a burgeoning battleground. Physical simulations (often based upon deterministic models) were giving way to computer simulations (often based upon stochastic models, especially Monte Carlo Simulations). Two theories were popularized during this time: catastrophe theory and chaos theory; ultimately though, both of these theories proved incapable of prediction - the hallmark of a good mathematical model. A different type of modeling technique, based upon relational algebra, was also moving from the theoretical work of Ted Codd, to the practical implementations at (the company now known as) Oracle: data modeling.

Mathematical models are attempts to understand the complex by making simplifying assumptions. They are always a balance between complexity and accuracy. One nice example of the evolution of a deterministic mathematical model can be found in the Ideal Gas Laws, progressing from Boyle's Law through Charles's Law, Gay-Lussac's Law and Avogadro's Law, culminating in the Ideal Gas Law, which all of us saw in high school chemistry: PV=nRT.
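As a quick numerical check of that law, here's a minimal Python sketch that solves PV=nRT for the molar volume at standard temperature and pressure; the gas constant and conditions are the textbook values.

```python
# Ideal Gas Law: PV = nRT, the culmination of Boyle's, Charles's,
# Gay-Lussac's and Avogadro's laws. Sanity check: one mole of an
# ideal gas at standard temperature and pressure (273.15 K,
# 101325 Pa) should occupy about 22.4 liters.

R = 8.314  # universal gas constant, J/(mol*K)

def ideal_gas_volume(n_moles, temp_kelvin, pressure_pa):
    """Solve PV = nRT for V, in cubic meters."""
    return n_moles * R * temp_kelvin / pressure_pa

v = ideal_gas_volume(1.0, 273.15, 101325.0)
print(round(v * 1000, 1), "liters")  # ~22.4 liters, as expected
```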

Mathematical models are used in pretty much all fields of endeavor: physical sciences, all types of engineering, behavioral studies, and business. In the 1970s, I used deterministic electrochemical models to understand and predict the behavior of various chemical stoichiometries for fuel cells and photovoltaic cells. In the 1980s, I used Bayesian statistics, sometimes combined with Monte Carlo simulations, to predict the reliability and risk associated with complex aerospace, utility and other systems.

The most popular use of Bayesian statistics was to expand the a priori knowledge of a complex system with subjective opinions. Likely the most famous application of Bayesian statistics, at the time I became involved with the branch, was the RAND Corporation's Delphi Method. There was actually a joke in the aerospace industry about the Delphi Method:

A team of RAND consultants went to Wernher von Braun to seek the expert opinion of the engineers working on a new rocket motor. The consultants explained their Delphi Method thusly. Prior to the first static test of the new rocket motor, they would ask, separately, each of the five engineers working on the new design their opinion of the rocket's reliability. Their opinions would form the Bayesian a priori distribution. After the test, they would reveal the results of the first survey and the test results, and ask the five engineers, collectively, their new opinion of the rocket's reliability. This would form the Bayesian a posteriori, from which the rocket's reliability would be predicted. Doctor von Braun said that he could save them some time. He gathered his team of rocket engineers, and asked them if they thought that the new rocket motor would fail. Each answered, as did Doctor von Braun, "no" in German. "There, you see, five nines reliability, as specified," declared the good Doctor to the RAND consultants, "No need for any further study on your part."

Yep, it's a side splitter. :))

I didn't like this method, and did things a bit differently. My method involved gathering all the data for similar test and production models, weighting each relevant engineering variable, creating the a priori, fitting with Weibull analysis, designing the Bayesian mathematical conjugate, using a detailed post-mortem of the first and subsequent tests of the system being analyzed, and updating and learning as we went, to finally predict the reliability and risk for the system. I first used this on the Star 48 perigee kick motor, and went on to refine and use this method for:

  • a variety of apogee and perigee kick motors
  • several components of the Space Transportation System
  • the Extreme Ultraviolet Explorer
  • Gravity Probe-B
  • a halogen lamp pizza oven
  • a methodology for failure mode, effects and criticality analysis of the electrical grid
  • and many more systems

This was, essentially, a combination of empirical Bayes and Bayesian weighted variables. Several of my projects resulted in software programs, all in FORTRAN. The first was used as a justification for a 1 MB [no, not a mistake] "box" [memory] for the corporate mainframe. NASA had sent us detailed data on over 4,000 solid propellant rocket motors. Talk about "big data". ;) I had a lot of fun doing this into the 1990's.
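To make the conjugate-update step concrete, here's a minimal Python sketch using the Beta-Binomial pair, the simplest conjugate family. This is only an illustration of the idea: the actual work used Weibull fits and weighted engineering variables, not a Beta prior, and every number below is hypothetical.

```python
# Sketch of conjugate Bayesian updating for reliability, using the
# Beta-Binomial conjugate pair. A Beta(alpha, beta) prior over the
# success probability, updated with pass/fail test outcomes, yields
# a Beta posterior in closed form. All numbers are hypothetical.

def beta_update(alpha, beta, successes, failures):
    """Conjugate update: Beta(a, b) prior + Binomial data -> Beta posterior."""
    return alpha + successes, beta + failures

def reliability_estimate(alpha, beta):
    """Posterior mean of the success probability."""
    return alpha / (alpha + beta)

# A priori built from data on similar motors: 95 successes in 100 firings.
alpha, beta = 95.0, 5.0

# Update with each new static test of the motor (1 = success, 0 = failure).
for outcome in [1, 1, 1, 0, 1]:
    alpha, beta = beta_update(alpha, beta, outcome, 1 - outcome)

print(round(reliability_estimate(alpha, beta), 3))  # posterior reliability
```

The learning-as-you-go quality of the method shows up directly: each test simply shifts the posterior, which then serves as the prior for the next test.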

The next paradigm shift, for me personally, was learning data modeling, and focusing on business processes rather than engineering systems. Spending time at Oracle, learning Richard Barker's computer-aided system engineering methods, I felt right at home. Rather than Bayesian statistics, I would be using relational algebra and calculus for deterministic mathematical models of the data for the business processes being stored in a relational database management system. I very quickly got involved in very large databases, decision support systems, data warehousing and business intelligence.

I was surprised then, and after 17 years continue to be surprised, at how few data modelers agree with the statement in the preceding paragraph. I'm surprised at how few data modelers go beyond entity-relationship diagrams; how few know or care about relational algebra and relational calculus. I'm amazed at how few people realize that the arithmetic average computed in most "analytic" systems is a fairly useless measure of the underlying data, for most systems. I'm amazed that BI and analytic systems are still deterministic, and always go with simplicity over accuracy.

But computer power continues to expand. Moore's Law still rules. We can do better now. Things that used to take powerful main frames or even supercomputers can be done on laptops now. We no longer need to settle for simplicity over accuracy.

More importantly, the R Statistical Language has matured. Literally thousands and thousands of mathematical, graphical and statistical packages have been added to the CRAN, Omegahat and BioConductor repositories. Even the New York Times has published pieces about R.

It's once again time to move from deterministic to stochastic models.

Over the next few weeks, I hope to post a series of "study guides" that will focus on setting up a web-based environment consolidating SQL and MDX based analytics, as expressed in Pentaho and LucidDB open source projects, with R, and possibly SQLStream. Updated 20100314 to correct links (typos). Thanks to Doug Moran of Pentaho for catching this.

There have been many articles as well on "Big Data". As I commented on Merv Adrian's blog post request for "Ideas for SF Big Data Summit":

One area of discussion, which may appear to be for the “newbies” but is actually a matter of some debate, would be the definition of “big data”.

It really isn’t about the amount of data (TB & PB & more) so much as it is about the volumetric flow and timeliness of the data streams.

It’s about how data management systems handle the various sources of data as well as the interweaving of those sources.

It means treating data management systems in the same way that we treat the Space Transportation System, as a very large, complex system.

-- Comment by Joseph A. di Paolantonio, February 1, 2010 at 4:09 pm

I believe this because there is a huge amount of data about to come down the pipe. I'm not talking about the Semantic Web or the piddly little petabytes of web log and click-through data. I'm talking about the instrumented world. Something that's been in the making for ten years, and more: RFID, SmartDust, ZigBee, and more wired and wireless sensors, monitors and devices that will become a part of everything, everywhere.

Let me just cite two examples from something that is coming, is hyped, but not yet standardized, even if solid attempts at definition are being made: the SmartGrid. First, consider the fact that utility companies are distributing and using smart meters to replace manually read mechanical meters at homes and businesses; this will result in thousands of data points per day as opposed to one per month PER METER. The second is EPRI's copper-riding robot, as explained in a recent Popular Science. Think of the petabytes of data that these two examples will generate monthly. [Order the Smart Grid Dictionary: First Edition on Amazon]
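The smart-meter arithmetic is easy to sketch. Assuming, hypothetically, one 100-byte reading per minute per meter and a utility with five million meters (none of these figures come from the examples above; they're just plausible round numbers):

```python
# Back-of-envelope data volume for the smart-meter example.
# Hypothetical assumptions: one reading per minute, 100 bytes per
# reading, 5 million meters, a 30-day month.

READINGS_PER_DAY = 24 * 60      # 1,440 readings per meter per day
BYTES_PER_READING = 100
METERS = 5_000_000
DAYS_PER_MONTH = 30

monthly_bytes = READINGS_PER_DAY * BYTES_PER_READING * METERS * DAYS_PER_MONTH
print(READINGS_PER_DAY, "readings/day vs. one manual read per month")
print(round(monthly_bytes / 1e12, 2), "TB per month")
```

Even at these modest per-meter rates, a single utility generates terabytes a month, before adding grid-riding robots or any of the other instrumented-world sources.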

The desire, the need, to analyze and make inferences from this data will be great. The need to actually predict from this data will be even greater, and will be a necessary element of the coming SmartGrid, and in making the instrumented world a better world for all of humanity.

Now if only we can avoid the likes of Skynet and Archangel.

An Open Source Children's Story

On Twitter today, Lance Walter asked me to go into the Ark Business with him, and Gareth Greenaway asked for entertainment. It must be a rainy Friday afternoon ;)

I'm not sure about Lance's offer, but I did tell Gareth the following story, from tweet-start to tweet-end. This isn't word for word as I tweeted. 'Tis a bit expanded, but the tale is the same.

Once upon a time there was a young penguin named Tux. Tux decided to set off on a journey through IT Land. Now IT Land is a dangerous place, full of hackers fighting crackers, and ruled by those in the Ivory Tower and the acolytes of the Megaliths.

Along the way, the adventurous Tux met the Dolphin, the Elephant and the Beekeeper. They made a pact on the Lucid glyph to become a Dynamo of IT, bringing power to the datasmiths of the Land.

They met many Titans from the Megaliths on their Quest. The Beekeeper used the open source bees to open the scrum along the way, blocking the hookers with their sharp claws.

Some of the Titans were helpful, some, not so much.

The Dolphin was empowered by the Sun. But the Sun was consumed by a powerful Oracle. The Elephant, too, gained a powerful ally, and they do Enterprise against the Oracle. The band of the Quest was broken, and Tux was sad.

The Era of Lucid thought ended, but the Dynamo yet powers the Lucid Glyph, and Tux can rely on the Dynamo and the Beekeeper to predict a future clear of the Oracle.

And thus this quest ends, but another soon begins, where Tux will meet new friends and new foes. Will Beastie and the dæmons be allies? Will the Paladin in the Red Hat be stalwart?

Perhaps we'll find out at OSCON, for Gareth suggested that an assemblage of geeks would enjoy this story, and we'll see if OSCON thinks our tales worthy of a keynote slot in 2010.

Do you recognize all the characters in this tale? Maybe the links will help.

What say you, OSCON? Would these tales make a worthy Keynote?

The TeleInterActive Press is a collection of blogs by Clarise Z. Doval Santos and Joseph A. di Paolantonio, covering the Internet of Things, Data Management and Analytics, and other topics for business and pleasure.