Mary Poppendieck, "Cognitive Bias in a Cloud of Data". Mainly about System 1 and System 2 thinking. Talked about overcoming bias by keeping multiple options open and seeking out multiple opinions (dissenters). Not very big-dataish, though she talked about how even big data requires some System 1 thinking (expertise) to design, analyze, and store data.
Dan McCreary (MarkLogic), "NoSQL and Cost Models". Some interesting points about how to get at the Total Cost of Ownership of your database. If a NoSQL database can scale to serve all your applications, that avoids the cost of ETL and duplicate data. Also talked about the need to agree on a standard format for data to avoid quadratic scaling costs as the number of applications increases. Made some remarks about the lower cost of parallel transforms.
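To make the quadratic-scaling point concrete, here is a rough sketch of the arithmetic, not anything shown in the talk. It assumes that without a shared format every application needs its own converter to every other application, and that with a shared format each application needs only one converter in and one out. The function names are made up for illustration.

```python
def point_to_point_converters(n_apps: int) -> int:
    """Converters needed when every app translates directly to every other app."""
    # One converter per ordered pair of distinct applications: n * (n - 1).
    return n_apps * (n_apps - 1)


def shared_format_converters(n_apps: int) -> int:
    """Converters needed when every app reads and writes one standard format."""
    # Each app needs one converter to the standard format and one from it.
    return 2 * n_apps


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, point_to_point_converters(n), shared_format_converters(n))
```

Under these assumptions, ten applications need 90 direct converters but only 20 with a shared format, and the gap widens as applications are added.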
Ravi Shanbhag (United Healthcare), "Apache Solr: Search is the new SQL". Basic intro. I found the need to define a schema sort of off-putting, and the idea of "dynamic fields" where you pattern-match on field names even worse. I think we can do better, as the next session showed.
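For anyone who has not seen dynamic fields, the idea is roughly the following. This is a plain-Python sketch of pattern-matching on field names, not Solr code or Solr configuration. The suffix patterns and type names are illustrative assumptions, and "first match wins" is a simplification that may not reflect Solr's actual precedence rules.

```python
from fnmatch import fnmatchcase

# Hypothetical dynamic-field rules: a glob pattern on the field name and the
# type assigned to any field whose name matches it.
DYNAMIC_FIELDS = [
    ("*_i", "int"),
    ("*_s", "string"),
    ("*_dt", "date"),
]


def resolve_field_type(name, explicit_fields=None):
    """Return the type for a field name, or None if nothing matches."""
    explicit_fields = explicit_fields or {}
    # Explicitly declared fields take priority over patterns.
    if name in explicit_fields:
        return explicit_fields[name]
    # Otherwise the type is inferred purely from how the field is named.
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(name, pattern):
            return field_type
    return None


if __name__ == "__main__":
    for field in ("price_i", "vendor_s", "created_dt", "title"):
        print(field, resolve_field_type(field))
```

The type of a field ends up encoded in its name, which is the part I found off-putting.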
Keys Botzum (MapR), "SQL for NoSQL", about Apache Drill. Pretty good slideware demo showing how to use Drill to run SQL queries across JSON files. Would have liked to see a multi-source demo as well, but Drill can handle this. Considering whether to try it out with Tintri autosupports. Definitely the best talk of the day: the speaker was enthusiastic and the technology is cool.
Frank Catrine and Mike Mulligan (Pentaho), "Internet of Things: Managing Unstructured Data." This talk left me less interested in the company than dropping by their booth had. Tedious explanation of the Internet of Things and lots of generalities about the solution. Gave customer use cases that architecturally all looked the same but repeated the architecture slide anyway. (Surprise: they all use Pentaho!)
(I never did write up an entry on this year's MinneBar.)