Two trends have emerged over the past few decades. First, linked in part to Aadhaar, government processes and databases have been significantly digitized, generating a large volume of administrative data that has so far seen limited analysis. Second, computing has become inexpensive, owing to falling equipment prices and technologies such as cloud computing. Together, these trends make it possible to analyze very large datasets at low cost and to draw deeper insights from them.
Data is now available at a very granular level, for both businesses and individuals. There is data on tax payments at the level of invoices, on links between businesses, on the school achievement of students, on the savings behaviour of self-help group (SHG) members, and on cash transfers to beneficiaries, which are a growing method of delivering social support. In Andhra Pradesh, the Gram Sachivalayam and Ward Secretariat (GSWS) database, used for the six-step verification for government schemes, covers almost the entire population of the state and can serve as a common benchmark for various other data sources. Organizations like APSSAAT are building the capacity to quickly implement large state-wide surveys to supplement administrative data.
Current products such as dashboards merely describe data; they use only a fraction of what is available and are of limited value as decision-support tools. The question is whether this mass of data can be used more effectively to support departmental decision-making, by applying more sophisticated statistical models and developing a more structured, process-driven approach to collecting, organizing, analyzing and using data. This document outlines such an approach.
Privacy and Data: As large datasets are harnessed in a distributed fashion to maximize their effectiveness, the risk of privacy breaches for both individuals and firms grows. A culture of privacy and non-disclosure of identifiable information must be inculcated in all those who work extensively with data. DAU is committed to developing this culture.
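One practice that can support non-disclosure of identifiable information is pseudonymization: replacing direct identifiers with keyed hashes before data is shared with analysts, so that records can still be linked across datasets without exposing the original identifier. The sketch below is illustrative only and does not describe any existing DAU system. The function names, field names and key are hypothetical, and in practice the key would have to be stored securely and separately from the data. Pseudonymization alone does not guarantee anonymity, since combinations of the remaining fields may still identify a person or firm.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest (hex string) of an identifier.

    The same identifier and key always give the same pseudonym, so
    records can be linked across datasets without the raw identifier.
    """
    normalized = identifier.strip().upper()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_identifiers(record: dict, id_field: str, drop_fields: set, secret_key: bytes) -> dict:
    """Copy a record, dropping direct identifiers and adding a pseudonymous ID."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in drop_fields and key != id_field
    }
    cleaned["pseudo_id"] = pseudonymize(str(record[id_field]), secret_key)
    return cleaned


# Hypothetical example record; all field names and values are made up.
example_key = b"replace-with-a-securely-stored-key"
example_record = {
    "household_id": "HH-000123",
    "name": "Example Name",
    "phone": "0000000000",
    "district": "Example District",
    "transfer_amount": 1500,
}
shareable = strip_identifiers(
    example_record,
    id_field="household_id",
    drop_fields={"name", "phone"},
    secret_key=example_key,
)
```

Here `shareable` retains the analytical fields (`district`, `transfer_amount`) and a `pseudo_id`, while the name, phone number and original household identifier are removed.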