9:30-10:00 Data Science Transforming Society

Presenter: Prof. Colin Wright

Prof. Colin Wright pursued an academic career in Computational Mathematics, including his PhD, research, teaching and supervision of PhD graduates. On retiring as full professor of Computational Mathematics at the University of the Witwatersrand, Johannesburg, he was granted the rank of Emeritus Research Professor. His senior academic management career included serving as Head of School(s), Executive Dean of the Faculty of Science, and Member of the University Council and its Executive and Finance Committees. He also served as the Wits executive responsible for Libraries and IT and led University and ICT Strategy formulation initiatives.

On retirement from Wits he was appointed as Research Manager of the Centre for High Performance Computing (CHPC) and was subsequently appointed to manage the SA National Research Network (SANReN). In 2008, he successfully motivated for the establishment of the national Very Large Database facility (VLDB), which was initially co-located with CHPC and is now being matured into the Data Intensive Research Initiative for South Africa (DIRISA). In early 2014 he was appointed by the Department of Science and Technology (DST) as Advisor on Cyberinfrastructure to lead the NICIS implementation.

Colin has been a member of the G8+O5 Research Data Infrastructure Working Group, which advised internationally on data strategy, since the group's inception. Over the last few years he has advised the European Commission in different capacities on e-Infrastructure matters, including as part of the EC H2020 Research Infrastructures and e-Infrastructures Advisory Group. He is a member of the SA-EU Expert Team tasked with drafting an Open Science framework for DST.

The field of data science is concerned with extracting knowledge from data; the scientific method provides a reliable means to do so, while the mathematical sciences (mathematics, statistics, computer science, etc.) provide the necessary tools. In recent years there has been an explosion in the volume, variety, and velocity of data. An estimate from IBM suggests that humans generated about 2.5 billion gigabytes of data per day in 2017, implying that approximately as much data was created in that year alone as in all previous years. This will soon be dwarfed by the amount of data the Square Kilometre Array (SKA) is projected to generate when it becomes fully operational. All these data are opening unprecedented opportunities to make advances in data science, especially in the subfield of machine learning. These advances have the potential to revolutionise nearly every area of human endeavour, including research, education, health, business, and government, and to support human progress by helping to achieve the Sustainable Development Goals (SDGs). However, there are many challenges (e.g., data science skills, new regulatory frameworks, etc.) that must be addressed if these emerging opportunities are to be fully harnessed to beneficial effect.