Data Democratization just sounds too buzzy
Do we gain or lose by making data more accessible in our organizations?
I’ve spent almost the entirety of a now 30+ year career in data and analysis. So when I first heard the term data democratization, it got filed in my head alongside ‘big data’, ‘internet of things’, ‘artificial intelligence’ and the like. Data democratization is a strong candidate for buzzword hall of fame status. Even as someone who generally loathes this kind of jargon, I thought maybe I should double-click into this idea. You know, peel back the onion a bit, take it to the next level...
Do we have inalienable rights when it comes to data?
At least one definition of data democratization seems to be that anybody should have access to any data with no barriers, anytime, for whatever reason. The reasoning, presumably, is that you would not want to limit the potential for good data insight to come from anywhere or anyone within your organization. Or am I confusing this with the definition of data anarchy? I mean, why shouldn’t anyone who wants to practice medicine be able to? Aren’t we limiting the potential for medical innovation by having laws in place that require you to have certain qualifications before you can prescribe someone a controlled substance? Don’t get me wrong, a psilocybin data science summit would be fantastic; no telling what would come out of that.
Let’s walk it back a second and bring in the democracy side. Maybe we do have certain inalienable rights when it comes to data. Could we provide for a common data security, promote the general use while securing the blessing of insights and reconciled books for our institutions?
Without so many laws and regulations we would likely see more innovation, growth and opportunity, but it might also come with a cost none of us are willing to pay. Thus the balance. How much government is too much? Too little? Like every citizen, every business analyst would prefer as much freedom (and data) as they can get, but those same analysts would almost universally agree they don’t want dirty data. They want data quality: accuracy, completeness, relevance. Otherwise they’re spending untold hours verifying, cleaning, conforming; ain’t nobody got time for that. So truth be told, we do want some laws, some regulation in place to make our lives easier.
As the number and complexity of our laws grow, so does the number of attorneys. It’s been the same with data. Ever since the ‘three V’s’ (volume, velocity, variety) became a thing, business intelligence tools and software, and the salespeople to go with them, have proliferated. Even the V’s have proliferated; I think we’re up to 5 now. All with good intentions, even the software salespeople. But do we just have to accept that the whole thing is now too big and too complex for the end data consumer to have a voice and the freedoms they long for?
Again, maybe there’s something here with the whole democracy and system of government idea? We should start at the federal level, which for us means the enterprise or institution. At this level you want to have as few laws as possible. Data from source systems should only go through the most basic of data quality checks and conforming. No transformations! Here you’re securing the common defense, so mostly you’re concerned about external threats. Protect the borders, be scrupulous about what comes in and what goes out. Inside the border, federate the data. Don’t make your citizens visit three different states just to put together a casserole. Three sources of sales data get put into one. Customer data, all in one place.
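For the technocrats in the audience, here is roughly what I mean by federal-level border control. This is a minimal Python sketch, not a real pipeline: the three sales sources, the field names and the two checks are all made up for illustration. The point is that rows get checked and tagged with where they came from, and nothing gets transformed.

```python
# Hypothetical example: three sales sources federated into one table.
# Source names, field names and rows are illustrative assumptions only.
REQUIRED_FIELDS = ("order_id", "customer_id", "amount")

SOURCES = {
    "pos": [
        {"order_id": "P-1", "customer_id": "C1", "amount": 120.0},
        {"order_id": "P-2", "customer_id": "C2", "amount": 35.5},
    ],
    "web": [
        {"order_id": "W-1", "customer_id": "C1", "amount": 60.0},
        {"order_id": "W-2", "customer_id": None, "amount": 15.0},  # missing customer
    ],
    "partner": [
        {"order_id": "R-1", "customer_id": "C3", "amount": "n/a"},  # amount not numeric
    ],
}


def passes_basic_checks(row):
    """Border control only: required fields are present and amount is a number."""
    if any(row.get(field) is None for field in REQUIRED_FIELDS):
        return False
    amount = row["amount"]
    return isinstance(amount, (int, float)) and not isinstance(amount, bool)


def federate(sources):
    """Combine all sources into one list, tagging each row with its source.

    Values are never changed. Rows that fail the basic checks are set
    aside rather than silently dropped or "fixed".
    """
    accepted, rejected = [], []
    for source_name, rows in sources.items():
        for row in rows:
            tagged = {**row, "source": source_name}
            if passes_basic_checks(row):
                accepted.append(tagged)
            else:
                rejected.append(tagged)
    return accepted, rejected


sales, quarantined = federate(SOURCES)
```

One place to look for sales, every row still says which state it came from, and the questionable rows sit in quarantine where anyone can inspect them.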
Then we get down to the state and local levels. Here you might have a few more laws or rules because you’re further conforming and transforming the data into a functional data mart. But those should only be the rules or laws the local constituency has voted for. Impose unreasonable control from above and either revolution breaks out or your constituency goes underground with shadow IT and rogue databases. Don’t like the laws in your city or state? That’s why you have multiple marts depending on the needs of the constituency.
Now we get down to the citizen. To date you could argue we’ve been trying to run a communist sort of data state: a technology-oriented government that knows what’s best for the citizens, assisted by large business intelligence tools that only partly satisfy the need to be free and in control of your own data destiny. If it’s too centralized, or if a small group decides what data gets made available, then we’re not quite a democracy. Data of the people, by the people…okay, so we’ve got data marts still built by the technocrats, but let their specifications be locally controlled. And whenever a citizen wants to know where a particular column came from, there’s data lineage to take them all the way back to the source. There are regular data quality checks and reports so they can be confident in what they’re using. Complete transparency. The ultimate test is whether or not the business analyst feels empowered to do their job, unfettered.
There should be a congress elected by the people. Maybe we call it a Data Governance Council to be buzzy. They decide the federal laws if and only if the people say they need one. ‘We need an enterprise definition and transformation for revenue per customer’. The council legislates it, the technocrats enforce it, the citizens see it in their marts. Marketing wants a different definition of revenue per customer. No problem, marketing gets their definition in their mart as well as being able to see and use the enterprise-sanctioned definition; local control, national unity.
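To make the revenue per customer example concrete, here is a small Python sketch. The two definitions are my own assumptions for illustration (enterprise counts revenue net of refunds, marketing counts gross), and the order rows are invented. What matters is that both definitions are named, sit side by side, and run against the same data.

```python
from collections import defaultdict

# Hypothetical order rows; fields and values are illustrative only.
ORDERS = [
    {"customer_id": "C1", "gross": 100.0, "refund": 0.0},
    {"customer_id": "C1", "gross": 50.0, "refund": 50.0},
    {"customer_id": "C2", "gross": 200.0, "refund": 20.0},
]


def enterprise_revenue(order):
    """Assumed enterprise-sanctioned definition: revenue net of refunds."""
    return order["gross"] - order["refund"]


def marketing_revenue(order):
    """Assumed marketing definition: gross revenue, refunds ignored."""
    return order["gross"]


def revenue_per_customer(orders, revenue_of):
    """Average revenue per distinct customer under the given definition."""
    totals = defaultdict(float)
    for order in orders:
        totals[order["customer_id"]] += revenue_of(order)
    if not totals:
        return 0.0
    return sum(totals.values()) / len(totals)


# The marketing mart exposes both definitions, each under its own name.
DEFINITIONS = {
    "enterprise": enterprise_revenue,
    "marketing": marketing_revenue,
}
marketing_mart = {
    name: revenue_per_customer(ORDERS, definition)
    for name, definition in DEFINITIONS.items()
}
```

With these made-up rows the enterprise figure is 140.0 and the marketing figure is 175.0. Neither number is wrong; each one just carries the name of the law it was computed under.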
I had a boss once say to me “This ain’t no democracy” in reference to some edict he wanted carried out. Fine. But if it ain’t no data democracy then it probably isn’t a place where [data] science and freedom of insight exploration can be had. Worse, it might be a place where regulatory violations go unnoticed until it’s too late and very costly.
Making it happen requires drawing on the best examples from a democratic government. Start with individual liberties. What will our new system secure and hold sacred for the end data consumer? Reliability, choice, traceability, transparency, quality? Build from the basic rights. Then build in mechanisms for everyone to have a voice in how their data is governed. Then build in mechanisms for how they can and must participate to assure their rights are protected. No free rides. Citizens have a responsibility to vote, pay taxes, support those they’ve put in charge. A tax might be that the citizen has to learn a few SQL commands to secure their freedom from unknown transformations and summarizations happening in a dark room at the hands of a detached report developer.
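How heavy is that tax? Probably lighter than you think. Here is a sketch of the kind of ‘few SQL commands’ I have in mind, run through Python’s built-in sqlite3 module so it works anywhere. The sales table, its columns and its rows are invented for the example; your own mart will look different.

```python
import sqlite3

# Hypothetical mart table in a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_id TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("A-1", "East", 120.0),
        ("A-2", "East", 80.0),
        ("A-3", "West", 200.0),
        ("A-4", "West", None),  # a row with a missing amount
    ],
)

# 1. Look at the raw rows yourself instead of trusting a summary.
raw_rows = conn.execute(
    "SELECT order_id, region, amount FROM sales ORDER BY order_id"
).fetchall()

# 2. Count the rows where the amount is missing.
missing = conn.execute(
    "SELECT COUNT(*) FROM sales WHERE amount IS NULL"
).fetchone()[0]

# 3. Rebuild the summary so you know exactly how it was computed.
#    Note that SUM skips NULL amounts rather than treating them as zero.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

conn.close()
```

SELECT, WHERE and GROUP BY: three commands, and the dark room has a window in it.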
Data democratization is definitely a great buzzword because hearing it makes me think it’s something I need to do without knowing anything about it. But I think I know what I would like it to be, and what it can be, and it’s probably something we all need to do now. Vive la révolution!