by Dimitri McKay
LogLogic Security Architect
Recently I was quoted in an article on CNet about “Big Data”. Dave Rosenberg made some excellent observations about how Big Data is being handled, and spotlighted some companies that are developing FOR Big Data.
But it got me thinking: do most people really understand what Big Data is?
Big Data is an increasingly popular phrase. It implies that we’re moving from the Terabyte age to the Petabyte age. It has become the latest challenge for large enterprises and government. It’s not just a buzzword. It’s a real problem that IT departments everywhere are struggling with. And storage isn’t the hardest part of Big Data. In fact, storage is easy. We have the ability to store petabytes and exabytes of data today. But making SENSE of that data...that is the real challenge.
Big Data, as with most quantifications, is a relative term.
How do you know when you have Big Data? Here’s how. If you have to ask yourself “How are we going to store this, organize this and manage this? How are we going to get information out of this that’s useful?”...then you have Big Data.
Martin Wattenberg, a mathematician and computer scientist at IBM's Watson Research Center in Cambridge, Massachusetts, says, “You can talk about terabytes and exabytes and zettabytes, and at a certain point it becomes dizzying. The real yardstick to me is how it compares with a natural human limit, like the sum total of all the words you'll hear in your lifetime. That's surely less than a terabyte of text. Any more than that and it becomes incomprehensible by a single person, so we have to turn to other means of analysis: people working together, or computers, or both.”
And he’s right. The more data you have, the harder it is to work with. But once it is analyzed, you can glean incredible information from it.
Data on a corporate network, whether it be database data, tons and tons of flat files, or even log data, is often unstructured and hard to make sense of. For some, this is a nightmare. The capture and storage of massive amounts of data is a thorn in the side of the average CTO. But on the academic side, on the research side, on the private sector side, this data is a goldmine. Being able to trend events over time, to build predictive models, and to index the entire internet... that’s big. To use it as a performance tool and to identify throughput and use cases... that’s big. Big Data then becomes a decision-making tool.
But what caused this?
Over time, disk prices dropped as data storage requirements went ever skyward. And with the advent of cheap storage, the need to delete that data went down. With more and more data being stored and going online every day, suddenly the focus shifted to data security. How do we protect our data? How do we know if our data has been stolen? If it’s been stolen, who stole it?
Before we knew it, storing data for the sake of forensics was on the rise, and after a rash of IP and user data thefts, compliance requirements from the Payment Card Industry kicked in, as did the scourge of all public companies: compliance with Sarbanes-Oxley (SOX). Soon HIPAA grew some teeth in the healthcare industry, and ISO 17799 came into effect. All of these mandates required audit trails to be kept for anywhere from three months to seven years. That’s when the log data piece of Big Data became a major part of the pie. Think about it. We’re talking about the storage of every log message from every device on a corporate network for up to seven years!
NOW we’re talking about BIG DATA.
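To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. Every input below (device count, events per second, average message size) is a hypothetical number chosen purely for illustration and not a figure from any real network; the only value taken from the discussion above is the seven-year retention period. It also assumes raw, uncompressed storage and decimal units.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignores leap days for simplicity


def retained_log_bytes(devices, events_per_sec_per_device,
                       avg_event_bytes, retention_years):
    """Estimate raw bytes needed to keep every log message for the full period."""
    bytes_per_second = devices * events_per_sec_per_device * avg_event_bytes
    return bytes_per_second * retention_years * SECONDS_PER_YEAR


def human_readable(num_bytes):
    """Format a byte count using decimal units (1 KB = 1000 B)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]
    value = float(num_bytes)
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


if __name__ == "__main__":
    # Hypothetical inputs: 1,000 devices, 10 messages per second each,
    # 200 bytes per message, kept for seven years.
    total = retained_log_bytes(1000, 10, 200, 7)
    print(human_readable(total))  # prints "441.5 TB"
```

Even with these modest made-up inputs, the total lands in the hundreds of terabytes. A larger network or chattier devices would push the estimate toward petabytes.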
Soon you may find yourself asking, “How are we going to store our data, organize our data and manage our data? How are we going to get information out that’s useful?”
It’s at that point you’ll realize that you too have Big Data.
Posted September 23, 2009