In a previous post, I said I’d talk about how to tune your dashboard without doing a bunch of analysis the hard way.
As with the dashboard, it may not help you to know what works for my business, but there are some common themes, and I’ll dig into those here.
Know Your Goal
This is the hard part: what do you need to do? You can’t tune toward anything until you have a goal in mind.
You need to clearly articulate your goal and name the data set where you expect the key information to be. I like to do this Mad-Libs style:
I need [what you need to see]
so that I can [what you will learn when you see it].
I’ll know it’s working when [the goal you want to achieve].
Here are some examples:
I need student data with roster, grade, continuation, and graduation rates so that I can identify early math instructors with the greatest influence over retention. I’ll know it’s working when I can show that retention is improved when using this information vs. naive roster assignment.
Or:
I need order data with customer, date, and SKUs so that I can identify relationships between items in semantically distant categories for our recommendation feature. I’ll know it’s working when out-of-category click-throughs for the recommendation panel move by 1.5% or more.
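A success criterion like that last one is easy to turn into a check you can rerun after every change. Here is a minimal sketch in Python. It assumes the 1.5% means an increase of 1.5 percentage points in click-through rate (it could just as well mean a relative change), and the function names and counts are made up for illustration.

```python
def click_through_rate(clicks, impressions):
    """Return clicks / impressions as a percentage."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return 100.0 * clicks / impressions


def goal_met(before, after, threshold_points=1.5):
    """Compare two (clicks, impressions) pairs.

    Returns (met, lift), where lift is the change in click-through rate
    in percentage points. Assumption: the goal is an increase of at
    least threshold_points percentage points.
    """
    lift = click_through_rate(*after) - click_through_rate(*before)
    return lift >= threshold_points, lift


# Hypothetical counts, invented for illustration only.
baseline = (400, 10_000)   # 4.0% click-through before the change
candidate = (580, 10_000)  # 5.8% click-through after the change

met, lift = goal_met(baseline, candidate)
print(f"lift = {lift:.2f} points, goal met: {met}")
```

Writing the threshold down as a parameter forces you to decide what "move by 1.5%" actually means before you start tuning.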
Use The Right Tools
The variety of places you might have useful data can be intimidating. You don’t need to be afraid of this. Just because you can’t predict what will be useful doesn’t mean your data are useless. You can use techniques like clustering to turn big groups of data into small groups or even single points. You can use supervised or unsupervised machine learning techniques to let your system tell you where to look.
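To make the clustering idea concrete, here is a small sketch of k-means, one common clustering technique, written in plain Python so nothing is hidden. It is not a claim about which algorithm you should use. The customer numbers are invented, and in practice you would probably reach for a library implementation such as scikit-learn’s KMeans instead.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Group 2-D points into k clusters; return (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2
                + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels


# Hypothetical customers: (orders per month, average basket value).
customers = [(1, 20), (2, 22), (1, 25), (9, 80), (10, 85), (11, 78)]

centroids, labels = kmeans(customers, k=2)
for centroid in sorted(centroids):
    print(f"segment center: {centroid[0]:.1f} orders, {centroid[1]:.1f} basket")
```

Six customers collapse into two segment centers, which is the "big groups into small groups or even single points" move at toy scale.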
You can use tools developed for companies with petabyte-scale data. They work in a company of any size, they’re free, and they’re good at this kind of stuff. Let them do the heavy lifting so you can focus on outcomes. Not everything has to start with an elaborate theory and proceed through a carefully constructed experiment.
That being said, you’re probably going to get the best results with a theory and a carefully constructed experiment.
Iterate
The topic of experiment design is too big to cover here, but the takeaway for Big Data in business is that you’re never done. You set out to answer a question and it generates more questions. You get even more data, run more preprocessing, answer deeper questions, get more answers and… ask more questions. This is not a defect. It’s how you know it’s working. And because you’re clear on your goal and how to measure success, things improve over time.