Having been involved with numerous Big Data efforts over the past few years, I’ve seen both successful and not-so-successful deployments. Big Data is definitely exciting, and it seems as though everyone wants to get involved these days. The trouble is that many people focus on the technology rather than on the reason for using Big Data in the first place: analytics.
If you are just getting started with Big Data, here are some tips that can help make your effort a successful one.
1) Let the Business Drive the Effort
This is the single most important factor in achieving success, as it ensures that business goals are met. From a design perspective, the business queries should drive the data model, not the other way around. You want to avoid the IT-initiated ‘build it and they will come’ approach at all costs. If you are old enough to remember the Enterprise Data Warehouse project (expensive, inflexible), you are lucky…take the opposite approach. If not, just make sure that the business is involved from day one.
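To make "the business queries should drive the data model" concrete, here is a minimal sketch. It assumes a hypothetical business question ("what is monthly revenue per product line?") and shapes the stored data around that question rather than around the raw feed. The names `Sale`, `build_monthly_revenue` and `monthly_revenue` are illustrative only and not tied to any particular platform.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    """One raw sales event (hypothetical shape, for illustration only)."""
    product_line: str
    month: str  # "YYYY-MM"
    amount: float


def build_monthly_revenue(sales):
    """Pre-aggregate raw events into the shape the business query reads.

    The key (product_line, month) mirrors the question being asked, so
    answering it later is a single lookup instead of a scan.
    """
    table = defaultdict(float)
    for sale in sales:
        table[(sale.product_line, sale.month)] += sale.amount
    return dict(table)


def monthly_revenue(table, product_line, month):
    """Answer the business question directly from the query-shaped model."""
    return table.get((product_line, month), 0.0)


sales = [
    Sale("widgets", "2016-01", 100.0),
    Sale("widgets", "2016-01", 50.0),
    Sale("gadgets", "2016-01", 75.0),
]
table = build_monthly_revenue(sales)
```

If the business later asks a different question, the model changes to fit it; the raw events do not dictate the structure.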
2) Temper the Exuberance
Once people find out what you are doing with Big Data, they are going to want to play too. This is a good thing. However, make sure that you have gone through at least one successful deployment before turning everyone loose on the cool new toys. This allows you to refine your overall approach, avoid unnecessary rework and determine which technologies and methods work best for your environment.
3) Do Not Fall in Love with a Particular Technology
Technology in the Big Data / Analytics space is rapidly evolving. What is hot today may well be obsolete tomorrow. For example, just a couple of years ago, many thought that MongoDB was the answer. Now, we hear companies asking for assistance in migrating their data off this platform. Today, the darling is Spark, but it is just a matter of time before something better arrives on the scene. Whatever technology you select, make sure that 1) there is a large community behind it and 2) your analytics tools can run against multiple repositories.
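One way to picture the second point is to keep the analytics logic separate from the storage behind it. The sketch below is an assumption-laden illustration, not a reference to any real product: `Repository`, `InMemoryRepository` and `CsvTextRepository` are hypothetical stand-ins for two different backends, and `revenue_by_region` is the analytics code that does not care which one it is given.

```python
import csv
import io
from typing import Iterable, Protocol


class Repository(Protocol):
    """Hypothetical interface: any backend that can yield order rows."""

    def fetch_orders(self) -> Iterable[dict]:
        ...


class InMemoryRepository:
    """Stand-in for one backend that already holds rows as dicts."""

    def __init__(self, rows: list[dict]) -> None:
        self._rows = rows

    def fetch_orders(self) -> Iterable[dict]:
        return iter(self._rows)


class CsvTextRepository:
    """Stand-in for a second backend that stores the same rows as CSV text."""

    def __init__(self, text: str) -> None:
        self._text = text

    def fetch_orders(self) -> Iterable[dict]:
        for row in csv.DictReader(io.StringIO(self._text)):
            # Normalize to the same row shape the analytics code expects.
            yield {"region": row["region"], "amount": float(row["amount"])}


def revenue_by_region(repo: Repository) -> dict[str, float]:
    """Analytics logic written once, against the interface only."""
    totals: dict[str, float] = {}
    for row in repo.fetch_orders():
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


old_store = InMemoryRepository([
    {"region": "east", "amount": 10.0},
    {"region": "west", "amount": 5.0},
    {"region": "east", "amount": 2.5},
])
new_store = CsvTextRepository("region,amount\neast,10\nwest,5\neast,2.5\n")
```

Because the analytics function depends only on the interface, swapping the repository underneath it does not require rewriting the analysis.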
4) Do Not Let the Fiefdoms Distract You
I was always amazed by the fact that departments within a company ran their business analytics from Excel spreadsheets. The more advanced users were using spreadsheets as a full-blown departmental database. Those of us in the IT world were only vaguely aware that these setups existed…I suspect by design, as most business users tend to view IT as an unnecessary bottleneck. These ‘fiefdoms’ are now replacing their Excel spreadsheets with Big Data tools, some of which will be different from the technology that you have selected. Be aware that this may lead to conflicts (e.g. ‘mine is better than yours’). The key is to be patient, stay the course and not let this get in the way of your Big Data deployment.
5) Use an Iterative Approach to Deploy
I have been a fan of the iterative approach for the majority of my career. In the Big Data world, you will find yourself working with many different components…some of which may not work particularly well together. Having the flexibility to make adjustments as the project evolves, without having a major impact on the schedule, is essential for a successful deployment.
Big Data is a great place to be right now and it will only get better as time goes on. Successful deployment requires that you 1) involve the business early on, 2) understand the obstacles waiting for you and 3) use an iterative approach. I wish you the best on your journey!
Regular Planet Mainframe Blog Contributor
Scott Quillicy is the CEO and founder of SQData, an Addison, Texas-based software company that specializes in high-performance data movement and changed data capture (CDC) for IMS, VSAM and relational databases. He has over 30 years of database experience and is considered an expert in database replication strategy and deployment. Prior to founding SQData, Scott was involved in the development of IBM’s DB2PM and other database performance tools.