IMRSV, Inc Provides Facial Analytics

Cara is coming to a brick-and-mortar store near you. But don't be insulted when she doesn't recognize who you are.

Recently, I met with Jason Sosa, the CEO and Founder of IMRSV, Inc… twice. What came through to me was his passion for understanding the societal and human impacts of the technologies he creates and brings to market. This passion makes the company's mantra of, and adherence to, "privacy by design" very real and central to its approach.

Cara is the core software product from IMRSV, Inc. Cara analyzes your face and determines demographic, attention and emotive statistics about you, without attempting to identify you. As IMRSV states, Cara turns any connected camera into an intelligent sensor, but does so anonymously. Move from one Cara camera to another, or move away and back again to the same Cara camera, and the temporary ID number associated with you changes.
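IMRSV hasn't published how Cara's temporary IDs work internally, but the behavior described above can be sketched in a few lines. This is purely a hypothetical illustration of per-track anonymous IDs, not IMRSV's implementation: the ID lives only as long as a continuous face track, and is discarded the moment the track is lost.

```python
import uuid

class AnonymousTracker:
    """Issues a temporary random ID per continuous face track.

    Once a person leaves the frame (or shows up at another camera),
    the ID is discarded and any reappearance gets a brand-new one,
    so observations can never be linked back to an individual.
    """

    def __init__(self):
        self.active_tracks = {}  # internal track handle -> temporary ID

    def observe(self, track_handle):
        # Reuse the temporary ID only while the same track is continuous.
        if track_handle not in self.active_tracks:
            self.active_tracks[track_handle] = uuid.uuid4().hex
        return self.active_tracks[track_handle]

    def track_lost(self, track_handle):
        # Forget the ID entirely; nothing persists across sessions.
        self.active_tracks.pop(track_handle, None)

tracker = AnonymousTracker()
first = tracker.observe("track-1")
same = tracker.observe("track-1")   # same continuous track, same ID
tracker.track_lost("track-1")
fresh = tracker.observe("track-1")  # the person returns, a new ID
```

The key design point is what is *absent*: no biometric template is stored, so there is nothing to match a returning face against.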

While Cara is pre-launch, I'm excited both by the technology and by the IMRSV, Inc business model. The business model is very simple: whether you're a small shop owner or a developer interested in using Cara as part of a sensor analytics ecosystem, you pay $39.95 per camera, which includes the stand-alone Cara software and the Cloud-based data-as-a-service. The possibilities presented by Cara are what really got me going, fueling both an exciting initial briefing and a follow-up four-hour "lunch" and demo.

  • small shop owners now have demographic information available to them that was previously beyond their reach
  • larger organizations can better understand why campaigns in the physical world succeed or fail, just as they can online
  • developers can build attention and emotions into their sensor platforms

What does any of this mean? Here are a few examples.

  • While Cara doesn't recognize a person, it can say that the person who bought X at the register was a young adult male or the person who bought Y in the drive-through was a senior female, based upon time stamps.
  • Online retailers have long established ways of determining who is buying what online. Brick-and-mortar stores have tried to do the same through loyalty programs. Privacy concerns have affected adoption of loyalty cards. By anonymously providing demographic and attention data, physical stores can glean the same understanding of their customers as the online versions, without violating privacy.
  • End-cap displays at retailers or kiosks at events can provide targeted messages, increasing effectiveness.
  • A simple toy, using a smartphone to draw a face and its camera to record the smile of a child, can smile back.
  • A car can respond to a driver glancing away from the road, perhaps "kicking" the driver in the butt through a vibration motor in the seat.
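The first example above, attributing a register sale to an anonymous demographic observation by timestamp, is easy to picture in code. The data below is invented for illustration; the matching rule (nearest observation within a time window) is my assumption, not a documented Cara feature.

```python
from datetime import datetime, timedelta

# Hypothetical data: anonymous camera observations and register transactions.
observations = [
    {"time": datetime(2013, 6, 1, 10, 0, 5), "gender": "male", "age_band": "young adult"},
    {"time": datetime(2013, 6, 1, 10, 7, 30), "gender": "female", "age_band": "senior"},
]
transactions = [
    {"time": datetime(2013, 6, 1, 10, 0, 7), "item": "X"},
    {"time": datetime(2013, 6, 1, 10, 7, 33), "item": "Y"},
]

def attribute_demographics(transactions, observations, window=timedelta(seconds=10)):
    """Match each sale to the nearest anonymous observation within a window."""
    results = []
    for txn in transactions:
        nearest = min(observations, key=lambda o: abs(o["time"] - txn["time"]))
        if abs(nearest["time"] - txn["time"]) <= window:
            results.append((txn["item"], nearest["age_band"], nearest["gender"]))
    return results

matched = attribute_demographics(transactions, observations)
# matched -> [('X', 'young adult', 'male'), ('Y', 'senior', 'female')]
```

Note that the join never needs an identity, only clock alignment between the register and the camera.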
[Image: The Cara Player]

In addition to starting companies, Jason is very interested in the Singularity, and in the impending impact of technology upon human employment and self-identity. This has led to both the "Privacy by Design" approach and the "Principles of Good Use" for developers/partners. If you don't believe me, maybe you'll believe Jules Polonetsky.

"Privacy by design solutions are critical to implementing new technologies in a world where data collection has become ubiquitous. Steps that Cara takes such as not collecting any personal information, and not storing, transferring or recording any images are key to ensuring privacy concerns are addressed as these technologies are rolled out."
- Jules Polonetsky
- Facebook.com/FutureofPrivacy
- @JulesPolonetsky

There are various pieces of research out there showing that the Internet of Things will become a 15-trillion-dollar market. By 2020, I strongly believe that there will be over a trillion sensors deployed and that if your "thing" isn't connected, it won't be a viable product. Companies like IMRSV, Inc are providing the ecosystem to allow sensor analytics from everyday objects at very affordable prices. This will push this market even further and faster than the pundits anticipate. So, let me put on my tinfoil hat and stand on my soap box:

  1. Every conceivable facet of everyday life will take advantage of connected data for informed decisions
  2. Current sensor, internet of things, and data management & analytic companies will be assimilated into sensor analytics ecosystems or die
  3. Privacy will be a major concern for those who care, but the majority don't know enough to care. Individuals like Jason and Jules will protect the masses and ensure adherence to privacy by design guidelines.

New Hope from Big Data

Big Data is a catchy phrase. Unfortunately, it is often misused and misunderstood. Often, Hadoop and Big Data are used interchangeably, as if the Apache Hadoop family of projects were the only solutions for Big Data, or as if the only use for these projects were Big Data. Neither is true.

As an EDW/BI practitioner, I watched Hadoop, or really the Map/Reduce framework, be embraced and forced into prominence by software developers who were frustrated by Structured Query Language (SQL) and the need to create Entity-Relationship Diagrams (ERD) as data models or schemas. They were equally unhappy with the various work-arounds to access Relational Database Management Systems from within their programs, such as Object Relational Models (ORMs) and Data Access Objects (DAOs). At first, I felt that these developers were simply lazy.
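The tension those developers felt is easy to show in miniature. This toy contrast uses SQLite and plain dicts as stand-ins for a relational store and a document store; the field names are invented for illustration.

```python
import sqlite3

# Relational style: the schema must be declared before any data arrives,
# and every row must fit it. A new attribute means a schema migration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.execute("INSERT INTO readings VALUES (?, ?)", ("temp-1", 21.5))

# Document style: structure travels with the data itself, so a record
# with extra fields slips in without any migration or ERD change.
doc_store = []
doc_store.append({"sensor": "temp-1", "value": 21.5})
doc_store.append({"sensor": "gps-7", "value": 3.2, "lat": 37.65, "lon": -122.49})

rows = conn.execute("SELECT * FROM readings").fetchall()
```

Neither style is "right"; the point is that the schema-first discipline that serves EDW/BI well is exactly the friction that drove many developers toward the NoSQL side.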

However, as I worked more with these so-called NoSQL technologies, they helped to clarify the dissatisfaction that I had felt during the years I was leading EDW and BI projects. Thirty years ago, I worked in Aerospace System Engineering, developing methods and algorithms for risk assessment using Bayesian statistics. By 1996, I had become involved in my first EDW project. Since then, the actual structure and functions associated with the data, as defined by the data itself, became less important than fitting the data into an artificial structure imposed by business process models.

Don't get me wrong. Relational algebra, relational calculus and the DBMS technologies that came out of this mathematics are all very useful. And, in the right hands, SQL is a very powerful language. ERDs provide a wonderful way to map data to business processes and to both transactional and analytic systems.

But… There is so much more that can be done with the data coming from traditional human-to-machine (H2M) interactions, and increasingly from human-to-human (H2H), machine-to-machine (M2M) and machine-to-human (M2H) exchanges. The interweaving of the flows of data from such disparate sources is what drives my research today.

  • Gamification driving the adoption of smart meters for utilities
  • Self-quantification use cases in the workplace
  • Sustainability for increasing the bottom line
  • Combining social media and sensor data for profitability
  • Sensor analytics as an ecosystem

These, and over 70 other use cases that I'm cataloguing, come from the innovation surrounding the hype of Big Data and the Data Science movement. In a recent Quark, I've classified this innovation into 11 areas. A complete mindmap is linked from the initial mindmap shown below, and in the report.

[Image: A Mindmap of the 11 Innovation Trends from Big Data]

The Quark covers the trends coming from these innovations, and develops the four keys required to bring valuable decision-making processes from these innovations into your organization. It's entitled "Big Data: It's Not the Size, It's How You Use It". For such a simple report, it took over eight months to develop; most of the delay was caused by the fast-paced evolution of the innovations. The executive summary from the Quark is linked from the title.

I hope that you find that information, as well as the mindmap, useful in incorporating inference, prediction, insight and performance with intuition for making better decisions.

DataGrok

For all the silliness surrounding Big Data and Data Science, all the hype and all the controversy, there are actually very innovative and disruptive technologies coming from this area, this new approach to data management and analytics [DMA]. How do we categorize the vendors or the technologies that have never existed before?

Predictives

One new area is Predictive Analytics, also called Predictive Intelligence. Since predictions are neither analytics as the term is used in BI, nor the Intelligence of BI, I don't like either name; I prefer the simpler "Predictives". Four companies with which I've had briefings fall into the Predictives category, but each of these companies has very different approaches and technologies for performing predictives. These companies are Opera Solutions, Alpine Data Labs, INRIX and Zementis. There are other companies that I'll include in a full report after receiving briefings, such as KXEN, Soft10 and Numenta. By the way, Numenta's product is named "Grok". Given their differences, do they really all belong in the same category?

Opera Solutions: Acting on petabytes of data, Opera Solutions provides a signal hub stack that starts with data management, moves through pattern matching in the signal layer, and, enhanced by their own Data Science teams, results in predictions and inferences for better decisions and enterprise advantage. For Opera, understanding the "signal" is more important than the underlying technology: signals create front-line productivity by manifesting and adjusting "gut feel", with machines doing the heavy lifting rather than directing humans.

Alpine Data Labs: Alpine Data Labs brings mathematical, statistical and machine learning predictive methods to the data in situ, no matter how small nor how big the data sets, within a variety of RDBMS technologies and Hadoop distributions. Alpine Data Labs helps data science teams address the data where it lays, across data types and functional areas, working with all the data to bring insight to bear on better decisions.

INRIX: INRIX data science teams and technology provide unique predictives using connected cars, connected devices and connected people.

Zementis: Zementis brings predictive modeling into decision management through their data science teams, Adapa product and strong commitment to the predictive markup modeling language [PMML]. Through partners and customers Zementis works with traditional and innovative data sources to provide decision management from predictives, data mining and machine learning for marketing solutions, financial services, predictive maintenance and energy/water sustainability.

DataGrok

One of the more interesting questions to come out of data science is how you really understand the data that is being gathered and presented. Two of the companies with which I've recently had briefings challenge the categories of Data Discovery and Data Exploration. Each has different technologies and a different approach to fully, deeply understanding your data, and to drawing conclusions from the data before doing other, more formal analytics. Over the past month, I've had the good fortune of very in-depth, in-person briefings by both of these companies. Both are helping those who need it most to truly, fully, deeply, easily understand their data. Their approaches, while very, very different, together constitute an entirely new category. Beyond data discovery, beyond data exploration, I call this new category Data Grokking.

"Grok" as I wrote in 2007, means to

"to fully and deeply understand"; [but you need some background on the word's origins]. It's Martian and not from any Terran language at all. It comes from the fertile mind of Robert A. Heinlein, and was brought to Earth by Valentine Michael Smith in Heinlein's wonderful 1961 novel Stranger in a Strange Land.

One of these companies is still in stealth mode, and I won't mention their name here. The other is Ayasdi, and Ayasdi takes a very, very interesting approach to grokking your data.

These two very different technologies, based upon very different science and mathematics, do indeed allow us to fully and deeply understand our data. Much like the Martian ceremony, the DataGrok allows us to mentally ingest our data, to realize creative insights from our data sets, and to recognize the fundamental interweaving among the data, that, prior to these two innovative firms, could only come about through a long, arduous struggle with the data sets.

As I mentioned, the one company is still in stealth mode, so I'll write about Ayasdi here.

Ayasdi

Ayasdi comes out of the intersection of Topology and Computer Science, as brought together by a Stanford Professor, Gunnar Carlsson, and Gurjeet Singh. The project started as a DARPA contract, CompTop, that has spanned more than four years; the CompTop project included Duke, Rutgers and Stanford nodes. Topological methods discover the structure of the data itself; this is somewhat analogous to, but not the same as, the probability density or cumulative distribution functions [pdf, PDF, cdf or CDF].
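To give a flavor of what "discovering the structure of the data" means, here is a deliberately tiny sketch in the spirit of the Mapper construction from the topological data analysis literature. This is not Ayasdi's product or algorithm, just a one-dimensional toy: cover the value range with overlapping bins, cluster the points inside each bin, and link clusters that share points. The resulting graph summarizes the shape of the data rather than the points themselves.

```python
def mapper_1d(points, n_bins=4, overlap=0.3, eps=1.5):
    """Toy Mapper-style summary of a 1-D point cloud.

    Returns (nodes, edges): each node is a frozenset of member points,
    and edges connect nodes whose member sets overlap.
    """
    lo, hi = min(points), max(points)
    width = (hi - lo) / n_bins
    nodes = []
    for i in range(n_bins):
        # Overlapping bin: stretched by `overlap` on each side.
        start = lo + (i - overlap) * width
        end = lo + (i + 1 + overlap) * width
        members = sorted(p for p in points if start <= p <= end)
        # Single-linkage clustering: break when the gap exceeds eps.
        cluster = []
        for p in members:
            if cluster and p - cluster[-1] > eps:
                nodes.append(frozenset(cluster))
                cluster = [p]
            else:
                cluster.append(p)
        if cluster:
            nodes.append(frozenset(cluster))
    edges = {(a, b) for a in range(len(nodes)) for b in range(a + 1, len(nodes))
             if nodes[a] & nodes[b]}
    return nodes, edges

# Two well-separated groups show up as two disconnected pieces of the graph.
data = [0.0, 0.5, 1.0, 10.0, 10.5, 11.0]
nodes, edges = mapper_1d(data)
```

Even this toy version recovers a fact about the data (two separate components) without fitting any distribution to it, which is the intuition behind the topological approach.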

Ayasdi is focused on four markets:

  1. Pharmaceuticals, Healthcare and Biotech
  2. Oil & gas
  3. Government
  4. Financial Services

From this, you can see that Ayasdi customers go after expensive data, i.e. data that is expensive to collect and expensive to use. Iris is the front end to the Ayasdi Platform, and while it is available as a private cloud, their offering is primarily SaaS.

The analyst community is trying to figure out where to put Ayasdi, thus my category of DataGrok. Another area of confusion is "What is the right tool for each step of the process from DataGrok to inferences and predictions?" Some of this stems from mistrust of machines, but we need machines that do more than count and sort; we need machines that help us to find insight and improve performance.

Numb3rs Prototyped Data Science

Data Science is a new term and a new job title that has been receiving quite a bit of hype. There have been arguments over the definition of this term, and whether or not it truly describes a new field of endeavor or is just an offshoot of statistics, software programming or business intelligence. Another take is that many of the definitions of a Data Scientist can be met by few if any individuals, but really define a team. The television show Numb3rs ran with new episodes in the USA from 2005 through 2010. In many ways, the show was a prototype for such a Data Science team. Let's look at the roles on the show, and see how they might translate into your organization.

The Mathematician or Statistician

Charlie Eppes is a boy genius who grew into a young professor of Mathematics at the fictional CalSci. Like so many professors, he supplemented his income by consulting; in his case, to the FBI and his brother, applying mathematics and statistics to solving crimes of all sorts. His breadth and depth of knowledge was remarkable, and unlikely to be matched in the real world. However, an applied mathematician or statistician, with knowledge of a branch of mathematics or statistics relevant to your problems, is essential in either a Data Scientist or a data science team.

Computational Statistician, Computer Scientist or Software Developer

Amita Ramanujan, a student at CalSci who achieves her doctorate in computational mathematics, becomes a professor at CalSci, helps with the consultations to the FBI and, as a side note, dates her one-time thesis advisor, Charlie Eppes. In many ways, Amita is the closest to being a Data Scientist of anyone in the show. Equally adept at mathematics, physics, statistics and hacking, Amita often acquires the required data from disparate sources, and transforms Charlie's mathematical visions into working code. If you hire an Amita, you might just have all you need to get Data Science producing real solutions for you.

Over time, both Charlie and Amita gain a fairly impressive domain knowledge of criminalistics.

Subject Matter Expert (SME)

And speaking of subject matter experts, this is where Numb3rs really prototyped what is required to make Data Science valuable in solving the crimes, er, problems at your organization. There were many SMEs, both as regular characters, and as special roles for specific shows. The regular characters included the FBI agents, from Don Eppes, the lead agent, and Charlie's brother, to his team, with David Sinclair and Colby Granger surviving the entire series. By the end of the series, these two FBI agents were taking turns suggesting and explaining mathematical approaches. Other FBI agents who were on the team for one or more seasons include Terry Lake, Megan Reaves, Liz Warner and Nikki Betancourt. Megan was also a profiler.

Also of important note was the Eppes brothers' father, Alan Eppes. Alan was a retired city planner for Los Angeles. His knowledge of building regulations and the city's byways, processes, neighborhoods and interactions was often instrumental in understanding the results of Charlie and Amita's calculations.

Another regular was Larry Feinhardt, holding the Walter T. Merrick chair at CalSci. It may surprise you to learn that a theoretical physicist and cosmologist, interested in studying the heavens, string theory and zero point energy, was a crucial member of a crime-solving data science team, but he was, aside from a brief hiatus aboard the International Space Station.

More important than these regular cast members, however, were the guest SMEs. There were some recurring roles, such as the drop-out with a knack for baseball stats, or the mechanical engineer who was more interested in how things failed than in how to build them. Specialists in flame propagation, biology, disease control, cryptology, cognition, gaming, chemistry, forensic accounting and more were all needed at one time or another to solve the crime.

Retrospective

For me, the lesson is that while you may find an individual who can be creative with the right math or statistics; find, extract and massage data; bring sufficient domain expertise; and write sophisticated code that turns creative algorithms into real solutions, you're more likely to need a team. And even that team will need additional help, from within the organization or from outside consultants. Your regular team and "guest stars" will include mathematicians, frequentists, Bayesians, engineers, scientists, accountants, business analysts and others to bring the best decisions out of your data.

And if you want to check out how realistic the mathematics was in Numb3rs, check out The Math of Numb3rs from Cornell University. For more on the cast, characters and show, look for the CBS official Numb3rs site, and, of course, the Wikipedia article on Numb3rs.

Update: 20120722: I'm honored by the mentions and retweets on Twitter from those who love the Numb3rs analogy. Thank you all.

Comment to BBBT Blog on Wherescape

Today started for me with a great Boulder BI Brain Trust [BBBT] Session featuring WhereScape and their launch of WhereScape 3D [registration or account required to download], their new data warehouse planning tool. Other than my interest in all things related to data management and analysis [DMA], the WhereScape 3D tool is particularly interesting to me in its potential for use in Agile environments and its flexibility in being used with other data integration tools, not just WhereScape Red. Richard Hackathorn does a great job describing WhereScape 3D, which launched in beta at the BBBT, complete with cake for those in the room, and which he has already downloaded and used. [I'm awaiting the promised cross-platform JAR to try it out on my MacBookPro.]

Unfortunately, Twitter search is letting me down today as I normally gather all the #BBBT tweets from a session, send them to Evernote, and check these "notes" as I write a blog post.

WhereScape 3D is a planning tool, allowing a data warehouse developer to profile source systems, model the data warehouse or data mart, and automagically create metadata driven documentation. Further, one can iterate through this process, creating new versions of the models and documentation, without destroying the old. The documentation can be exported as HTML and included in any web-based collaboration platform. So, there is the potential of using the documentation against Scrum style burn down lists and for lightweight Agile artifacts.

WhereScape 3D and Red come with a variety of ODBC drivers, and, with the proper Teradata licensing, the Teradata JDBC driver as well. One can also add other ODBC and JDBC drivers. However, neither WhereScape product currently allows connections to non-relational data sources. I find this severely limiting, as in traditional enterprises we've never worked on a DMA project that didn't include legacy systems requiring us to pull from flat files, systems written in Pick Basic against a UniVerse or other multi-value database management system [MVDBMS], electronic data interchange [EDI] files, XML, or Java or RESTful services. In other cases, we're facing new data science challenges of extreme volumetric flows of data from web, sensor and transaction logs, requiring real-time analytics, such as can be had with SQLstream, or stored in NoSQL data sources, such as Hadoop and its offshoots.
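To make the point concrete, a DMA project typically has to normalize exactly these kinds of heterogeneous sources into one shape before any warehouse tooling can help. The snippet below is an illustrative sketch, with invented sample data standing in for a flat-file extract, an XML feed and a RESTful JSON payload.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical extracts from three very different legacy sources.
flat_file = "id,amount\n1,10.50\n2,7.25\n"
xml_feed = "<orders><order id='3' amount='4.00'/></orders>"
rest_payload = '[{"id": 4, "amount": 2.75}]'

def extract_all():
    """Normalize each source into the same (id, amount) record shape."""
    records = []
    # Flat file: parse CSV rows and coerce types.
    for row in csv.DictReader(io.StringIO(flat_file)):
        records.append((int(row["id"]), float(row["amount"])))
    # XML feed: walk the order elements and read their attributes.
    for order in ET.fromstring(xml_feed):
        records.append((int(order.get("id")), float(order.get("amount"))))
    # RESTful payload: JSON already carries types.
    for item in json.loads(rest_payload):
        records.append((item["id"], item["amount"]))
    return records

records = extract_all()
```

None of this is hard individually; the point is that a planning tool limited to relational sources never sees any of these records until someone has already done this work outside it.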

Which leads us to another interesting feature of WhereScape 3D: it's designed to be used with any data integration tool, not just WhereScape Red. I'm looking forward to getting that JAR file, currently hiding in a MS Windows EXE file, and trying WhereScape 3D in conjunction with Pentaho Data Integration [PDI or KETTLE] to see how the nimble nature of WhereScape 3D planning works with PDI Spoon AgileBI against all sorts of data flows targeting LucidDB ADBMS and data vault. Yeehah!
