AI researchers detail obstacles to data sharing in Africa

Artificial intelligence researchers say data sharing is a key part of economic growth in Africa, but that it faces a number of common hurdles, including the threat of data colonialism. The African data market is expected to grow steadily in the coming years, and an African data center business organization predicts that the continent will need hundreds of new data centers to meet demand over the next decade.

In a paper titled “Narratives and Counternarratives on Data Sharing in Africa,” the research team lays out structural issues, including but not limited to financial or infrastructure issues. The co-authors argue that ignoring the ethical concerns associated with these barriers could cause irreparable harm.

“Currently, a significant portion of Africa’s digital infrastructure is controlled by Western tech powerhouses, such as Amazon, Google, Facebook and Uber,” the paper says. “Traditional colonial powers pursued colonial invasion with justifications such as ‘educating the uneducated.’ Data accumulation processes are accompanied by similar colonial rhetoric, such as ‘liberate the bottom billion,’ ‘help the unbanked,’ ‘connect the unconnected,’ and use data to ‘jump poverty.’”

Power imbalances, lack of investment in building trust, and disregard for local knowledge and context are identified as the three most common barriers to data sharing, as “whole heterogeneous geographies of people see their data viewed and shared, but do not reap the same benefits as data collectors and owners of data infrastructure,” according to the paper. The co-authors argue that the dominant narratives around data sharing in Africa today focus on a lack of knowledge about the value of data and often suffer from what they call deficit narratives: stories that focus on topics such as poverty, unemployment or illiteracy rates.

“In recent years, the African continent as a whole has been seen as a cutting-edge opportunity for building data collection infrastructure. Enthusiasm around data sharing, and in particular machine learning or data science for development/social good, ranges from tempered discussions around new avenues of research to proclamations that ‘the AI invasion is coming to Africa (and that’s a good thing).’ In this work, we echo previous discussions that this can lead to data colonialism and significant, irreparable harm to communities,” the paper reads.

The co-authors argue that responsible data sharing in Africa should reject practices that lead to data colonialism and focus first on meeting the needs of individuals and local communities. They say it requires awareness and examination of influential issues such as the legacies of colonialism and slavery. They warn that this context can contribute to data policy or practices rooted in Western-centric extractive practices that are “ill-suited to the African context”.

The largest data center in Africa is reportedly under construction in South Africa. It’s part of a wave of investment in African data centers and telecommunications companies that some have called a gold rush. Microsoft opened its first data center in Africa in 2019. AWS opened a region in South Africa last year. Google is expected to complete construction of the Equiano undersea cable later this year, and Facebook is building an undersea cable that is expected to be completed in two to three years. Nvidia is also stepping up its operations in Africa.

The “Rise of the African Cloud” analysis by Xalam Analytics found that less than 1% of global public cloud revenue came from Africa in 2018.

Above: An illustration of stakeholders in the African data ecosystem in the article “Narratives and Counternarratives on Data Sharing in Africa”

The paper draws its conclusions from interviews with African data experts and insights from the co-authors, a number of whom grew up in Africa or currently live on the continent. Rediet Abebe grew up in Ethiopia and co-founded Black in AI. Abebe is an assistant professor of electrical engineering and computer science (EECS) at UC Berkeley, the first Black faculty member in the school’s history.

Abeba Birhane also grew up in Ethiopia. Currently a Ph.D. student at University College Dublin, she received the best paper award at the NeurIPS Black in AI workshop in 2018 for her writing on relational ethics. Birhane has written extensively on algorithmic colonization. Sekou Remy grew up in Trinidad and Tobago but currently works as a research scientist and technical officer at IBM Research Africa in Kenya. And George Obaido and Kehinde Aruleba are Nigerian and co-authored the paper in association with the University of the Witwatersrand in South Africa.

“Data sharing practices that operate in the absence of knowledge of local norms and contexts contribute – albeit indirectly – to the erosion of trust between stakeholders in the data sharing ecosystem,” the paper says. “As machine learning and data science focus on the Global South and particularly on the African continent, the need to understand the challenges that exist in data sharing and how we can improve practices in data becomes more pressing.”

Power plays a major role in data sharing in Africa. For example, research cited in the article found that Africans are significantly underrepresented in the biomedical research community, even when the data comes from Africa.

“Asymmetries of power, historically inherited from the colonial era, are often found in data practices and manifest in various forms, from unbalanced authorship to the unequal bargaining powers that accompany funding,” the paper says. The co-authors add that power imbalance is also a factor in the relationship between project managers and data analysts; data analysts and data collectors; and data collectors and research participants.

The paper also encourages understanding of attitudes toward data among African researchers. Governments in places like Ghana and Kenya have opened data portals, but a survey of South African researchers found that only about one in five share data with others, and a 2018 study involving life scientists in more than a dozen countries in sub-Saharan Africa described a number of barriers to data sharing. That same year, the governments of countries like Botswana, Ethiopia and South Africa developed national data strategies. To solve common problems, the African Union formed an AI working group in 2019.

“Trust is the fundamental element of all relationships in a data sharing ecosystem,” the document states. “The future of open data management and data sharing and their contribution to the advancement of science and technology in Africa will continue to grow, despite the sluggishness caused by lack of funding, redundant policy frameworks and limited infrastructure.”

The paper has been accepted for publication at the ACM Conference on Fairness, Accountability, and Transparency (FAccT). The virtual conference begins next week. Other papers accepted for publication at FAccT include research that examines how language models work with word association and censorship, and a call for a culture shift in machine learning by the Ethical AI team at Google and the University of Washington. The FAccT conference was co-founded by Timnit Gebru, the head of Google’s Ethical AI team who was fired at the end of 2020. The conference has a history of being sponsored by a number of large tech companies that have a poor track record of hiring Black researchers, such as Facebook AI Research (FAIR), Google’s DeepMind, and Google.

James G. Williams