In the first two posts in this series, I used the software on the Oracle Big Data Appliance 3.0 to ingest web log data from the Rittman Mead blog server, parse and load that data into a Hive table, and then join that table to another to add details on post and author. Links to the post in this series are below, and I’ll complete them as the series is posted this week:
Long-term readers of this blog will know that we’ve supported for many years the BI Survey, an independent survey of BI tools customers and implementors. Rittman Mead have no (financial or other) interest in the BI Survey or its organisers, but we like the way it gathers in detailed data on which tools work best and when, and it’s been a useful set of data for companies such as Oracle when they prioritise their investment in tools such as OBIEE, Essbase and the BI Applications.
Here’s the invite text and link to the survey:
Today saw the official launch of the Oracle Database In-Memory Option, with Larry Ellison going through the product features and then reading out some quotes and testimonials from beta testers. Rittman Mead were part of the beta testing program, with several of our team testing out various scenarios where we ETL’d into it, used it with OBIEE and worked out what would be involved in “in-memory-enabling” some of our customer’s BI systems.
Here's a silly little issue I hit this afternoon - mostly a case of "administrator error" but a time waster nevertheless.
In this series of posts, I’m going to be looking at an end-to-end ETL process on Oracle Big Data Appliance, using Hadoop technologies to do the data manipulation and Oracle Data Integrator 12c to orchestrate the process. Over the five posts, I’ll be landing web server log data on the Hadoop cluster using Flume, loading that data into a Hive table, transforming and enhancing the dataset, and then loading the final dataset into an Oracle database using one of the Oracle Big Data Connectors.
A few months ago I posted an article on the ODI12c examples in the new Oracle Big Data Appliance, and over the past few weeks I’ve been writing about the various components within the Cloudera CDH Hadoop stack, including Hive, Pig, Spark and Flume.
I’m pleased to announce that I’ll be presenting at the Oracle Technology Network (OTN) Virtual Technology Summit, an online event running over three days in early July.
We’re sad to announce that Stewart Bryson’s last day with Rittman Mead was last Friday, and he’s moving-on now to a new challenge. Five years ago Stewart co-founded Rittman Mead America with us, and most recently he’s been our Chief Innovation Officer, handing over the day-to-day running of our US business to Charles Elliott, our US Practice Manager.
I want to briefly interrupt my blog series on GoldenGate and ODI – A Perfect Match in 12c… to provide another Data Integration Tip that was found during a recent client engagement.