What You Need to Know Before You Train on Big Data

Big data is everywhere, and savvy companies have already made it part of their business plans. But not all of them have figured out how those plans incorporate technical training.

John Kidd, an instructor, said there are several important things for learning leaders to consider as they create strategies to provide the right technical training on big data. First, how do they define big data?

Are they talking about batch processing and storing a large volume of organizational data? Or are they talking about stream processing, where data arrives at high velocity and the concern is how quickly the organization can use it to make business intelligence decisions? “If you have that information, that would be a huge step forward in putting together a learning program,” Kidd said.
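To make the distinction concrete, here is a minimal Python sketch. It is only an illustration: the order amounts and the function names `batch_total` and `stream_totals` are hypothetical, and real deployments would use a batch or streaming framework rather than plain Python.

```python
from typing import Iterable, Iterator, List


def batch_total(orders: List[float]) -> float:
    """Batch: the full stored dataset is available, so compute the answer once."""
    return sum(orders)


def stream_totals(orders: Iterable[float]) -> Iterator[float]:
    """Stream: update a running answer as each record arrives."""
    total = 0.0
    for amount in orders:
        total += amount
        yield total  # an up-to-date figure is available after every record


# Hypothetical order amounts, used only for illustration.
orders = [20.0, 35.5, 12.25]

print(batch_total(orders))          # one answer, after all data is collected
print(list(stream_totals(orders)))  # one answer per incoming record
```

The batch version answers the question once the data has been collected; the streaming version has a current answer after every record, which is what matters when decisions depend on how quickly data can be used.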

A company can’t just hop on a trend. It has to have a business case for any big data implementation. “That dictates quite a bit,” he said. “What are the main goals? Next we have to understand what will they be doing with it? What are they trying to accomplish?”

For instance, a big box retailer might want to use big data technology to better understand user behavior. If user X orders a pair of pants online and continues to shop, what might he look for next? This information could enable the retailer to provide digital recommendations.

Kidd also said that when it comes to providing technical training around big data, it’s important to know which vendors a company uses. Cloudera? Hortonworks? MapR? Those vendors make the software that a training provider like DevelopIntelligence will train on.

Also, who is the audience? What is their level of technical proficiency? What is their role in the organization? Any training vendor should customize offerings based on those different learning paths.

“If they’re a developer/engineer vs. a data analyst that will dictate the training,” Kidd explained. “Lay out a path for training all the stakeholders and participants in a big data roll out. For data scientists and people who do data analysis it will be one track. For engineers and technical implementers it will be a different track.”

Helping a company focus training on the right things is extremely important to avoid wasting time and resources. Without identifying an audience’s proficiency level, for instance, an organization might organize and execute a four-day big data training event with the wrong content for the wrong audience. Data scientists and analysts shouldn’t be in the same class as developers or engineers. The former two will find the content too technical and not entirely appropriate for their respective roles.

“That’s just general training common sense,” Kidd said.

The other area of big data that many companies don’t pay enough attention to is governance. Once they have data stored in an implementation or cluster, how will they manage that? How will they maintain the meta information on where the data comes from? How has it changed since it was created?
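As a rough illustration of the kind of meta information those questions point to, the Python sketch below keeps a small record per dataset. The `DatasetRecord` class, its fields, and the sample values are all hypothetical; dedicated governance tools track far more than this.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class DatasetRecord:
    """Hypothetical metadata record for one stored dataset."""
    name: str
    source: str             # where the data comes from
    created: date           # when it was created
    retention_reason: str   # why the organization is keeping it
    changes: List[str] = field(default_factory=list)  # how it has changed

    def log_change(self, when: date, note: str) -> None:
        """Append a dated note describing a change to the dataset."""
        self.changes.append(f"{when.isoformat()}: {note}")


# Sample values are invented for illustration only.
record = DatasetRecord(
    name="web_orders",
    source="online storefront order system",
    created=date(2020, 1, 15),
    retention_reason="input for product recommendations",
)
record.log_change(date(2020, 2, 1), "removed duplicate rows")
print(record.changes)
```

Even a record this small answers the three questions above: where the data came from, why it is being kept, and how it has changed since it was created.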

“Companies need to consider additional training for specific data governance tools,” Kidd said. “Those tools do exist and any company that’s implementing a big data strategy, they better be paying attention to this. It’s going to come up and bite them later on if they don’t.”

Imagine you have a notebook computer with a bunch of files on it. Why are you saving a particular file? Do you really need it? Are you scared to delete the file because you don’t know what it is? Now imagine that uncertainty on an enterprise scale where you’re dealing with terabytes or even petabytes of data.

“That’s a governance issue,” Kidd explained. “Why do we need that data? Where did it come from? Why do we care about keeping it, and how are you going to manage it?”

Companies use big data in different ways depending on their industry. Some use it to analyze customer behavior, others to monitor the health of their internal and external systems or to detect fraud, and there are many more use cases. But Kidd said it’s important to acknowledge that just because a company implements a big data solution, that does not mean everything becomes a big data problem. “That’s an ancillary topic that I always talk about in a class because if I get a hammer everything becomes a nail.”

That’s why training is important. “Exactly,” Kidd said.
