Re: <nettime> Proposition on Peak Data

John Preston on Sun, 10 Apr 2022 21:26:49 +0200 (CEST)
To: Geert Lovink <geert@xs4all.nl>
Subject: Re: <nettime> Proposition on Peak Data
From: John Preston <nettime@jpreston.xyz>
Date: Sun, 10 Apr 2022 21:17:34 +0200 (CEST)
Cc: a moderated mailing list for net criticism <nettime-l@mail.kein.org>
Thanks for your proposition, Geert.

I believe I can understand and agree with some of what you are saying, but I think there is a mist here obscuring some of your ideas because you do not present an ontology for data, and so those are my first questions: what is data? what types of data exist?

Also, your email seems to imply that all data are 'bad', or rather that the collection of any data is bad, and so there are axiological and ethical questions as well: are all data bad? which data are bad and which data are not-bad? what makes some data bad and not others?

In terms of ontology, I can share my own perspective as someone who works very heavily with data in my job and research. I would define data as "intentionally gathered information about a thing". Under this definition, data are not 'emitted' by things in the world, nor do they exist independently of human action; rather, I would argue that data are created/generated by measurement, by the interaction of things with data gathering mechanisms.

For example, we may consider the fuel gauge in a car. The tank contains fuel, and we install a system which can measure the volume of fuel in the tank, and connect this to a dial on the dashboard. The data on the current fuel availability of the car are thus generated by the interaction of the fuel tank, the fuel in the tank, and the measurement system, and are presented to the driver via the dial.

Another example would be COVID case rates. Individuals are tested -- a swab is taken and a biological assay is run to test for the presence of the virus -- and that generates data to indicate that the individual has, or recently had, COVID. Those data can be aggregated to develop a case rate for the whole population.

I suspect that in your complaints about data, and big data in particular, you are not saying that all data are bad, that cars shouldn't have fuel gauges and that there should be no COVID testing. Instead I would propose that there are certain data, and certain issues around the gathering/generation of data, its ownership, and its processing, which are problematic.

In particular, you discuss "data colonialism" and name "Facebook[,] Google[,] Booking, Uber and Airbnb". In these cases I view the issue as being that these companies generate data from and about people without their consent, and then use those data in ways that enrich themselves without remunerating, or even at a cost to, those people (their 'users' or 'visitors').

This is clearly problematic and I agree that this data colonialism (I think it is an apt metaphor) needs to stop. I agree that we should "build firm peer-to-peer (payment) systems and create one, two, many local non-profit cooperatives that focus on distribution as alternatives to, for instance, Amazon". And I am 100% in favour of opening up the black boxes of the algorithms which power the predictions and insights used by governments and companies.

However I cannot agree with the call for a "defunding of all data sciences and AI research", and I think the idea that "[data] have not been able to debunk antivaxers and no longer legitimize lockdown regimes" is a false framing. Rather I think that conscientious gathering and analysis of data, with a complete appreciation for the fact that "data are not objective, data comes with interests", is essential for navigating the complex world we live in.

The pandemic is a good example here. In the UK, we used testing to monitor the replication rate of COVID, and modelling to predict the potential growth and impact on healthcare services, and to inform the policy decisions around implementation of measures such as masks and lockdowns. That is not to say that the UK's response was perfect, that it was driven entirely by data, or that the data were objective, but rather that because we had those data we were able to mount a better response than we would have been otherwise.

Additionally, nettimers may be interested in a recent publication from some colleagues of mine where they modelled the impact of earlier lockdowns on COVID casualties across Europe [1].

And so altogether I find it hard to understand your claim that "[people] are aware that the academic funding darling called data science is a racket. After the manic, restrictive Covid years, the big data hype lost its innocence for good."

You also quote Forbes, “If a vehicle accident occurs, you can call up the images that the vehicles involved recorded to decide what caused the accident and which algorithms need improvements.”, and say that this quote "exemplifies dataism", but I fail to see the problem with this example. If we can generate/gather data on vehicle accidents and put that to good use, then we can prevent terrible accidents in the future. Isn't that a good thing? Consider Andrew Lam's video on road safety barrier design [2]: through controlled experiments, scientists are able to gather data about the effects of barriers on (hypothetical) passengers in car crashes, and develop new barriers which are much safer. Of course this is not exactly the same, because there is a difference between the real-time, continuous gathering of data on driving in the real world and controlled experiments.

In my ontology of data, I deviate heavily from the definition presented by Jose van Dijck in "Datafication, dataism and dataveillance: Big Data between scientific paradigm and ideology" [3], which I reproduce here:

> First and foremost, dataism betrays a belief in the objectivity of quantification and in the potential of tracking all kinds of human behavior and sociality through online data. Secondly, (meta)data are presented as “raw material” that can be analyzed and processed into predictive algorithms about future human behavior—valuable assets in the mining industry. Let me explore in more detail each of these ontological and epistemological assertions underpinning dataism as a belief in a new gold standard of knowledge about human behavior.

As a self-proclaimed data scientist, I cannot agree that data are objective, or that they are a "raw material". As I state above, I view data as being generated by the gathering process, and I caution heavily against the idea that data can be uncritically interpreted or analysed without an understanding of their provenance: where they come from and how they were gathered. I know that many of my colleagues have similar concerns, and I hope that this 'dataist' view is in the minority amongst data practitioners. Therefore I think we can have data without dataism, and perhaps the problem is not data itself, but rather a disrespectful or contemptuous approach to working with data by data practitioners and their employers.

[1] https://eprints.soton.ac.uk/454579/
[2] https://www.youtube.com/watch?v=w6CKltZfToY
[3] https://ojs.library.queensu.ca/index.php/surveillance-and-society/article/view/datafication

Thanks,
John (they/them)



8 Apr 2022, 07:38 by geert@xs4all.nl:

>
> “When it came to empirical methods, Adorno became fundamentally suspicious, for a science reduced to counting, measuring and weighing made the triumph of reification complete for him.” Jörg Später (1)
>
>
> Data, the raw material from which information is derived, is stored, copied, moved and modified more easily than ever. This quantum leap reaches levels outside our imagination. Surrounded by sensors, recommendation systems, invisible algorithms, spreadsheets and blockchains, the ‘difference that makes a difference’ can no longer be identified. Big Data is a More Data ideology, driven by old school hypergrowth premisses. As Nathan Jurgenson once observed: “Big Data always stands in the shadow of the bigger data to come. The assumption is that there is more data today and there will necessarily be even more tomorrow, an expansion that will bring us ever closer to the inevitable pure ‘data totality.” (2) Nothing symbolizes the current hypergrowth obsession better than Big Data. Let’s investigate what happens when we apply degrowth to data and reserve datafication–as a decolonial project, a collective act of refusal, an ultimate sign of boredom. We’re done with you, data system, stand out of my light.
>
>
> “Bigness is sameness,” Katherine Behar argues. (3) As a result, we’re facing a declining return on difference. With ever more data—both the good and evil—we’re no longer gathering new insights. Peak data is ahead of us. The vertical axis is not an endless plateau and one day we’ll reach its summit. (4) There are awakenings, also to the wet dream of the technology of limitless petabytes of storage and computing power. Soon, the maximizing of the data flows is reaching its upper limit, much like the cosmic speed limit. This may, or may not be a technical limit. Following the definition of peak oil, we can state that peak data will be the moment when the maximum rate of extractivism is reached and the platform logic implodes, after which a steep decline sets in until systems and their users are outside of the entropy danger zone.
>
>
> Let’s define peak data as the moment at which data extraction reaches a rate greater than any time in the past and then starts to decline. When the territory is already drawn and the costs of ever-more detailed maps are simply no longer worth it, the data gathering machine eventually falls silent. Peak data is related to the distinct concept of data depletion when the moral cost of ‘surveillance capitalism’ outweighs the economic benefit for the few and society as a whole starts to decline because of an excess of social disparity. Once the peak is reached, the presumption that the better the information, the better the decision-making process can no longer be maintained. Dataism itself is a paradigm and the end of its authority is near. Meaningful data chunks no longer provide us with significant differences and we are looking right into the abyss of bit rot. Can we think of peak data in a similar way to the post-digital, or post-data, when excitement over data is now historical, and we no longer associate it with narratives of progress. Today data has become our collective problem.
>
>
> In a variation of Marx, we could speak of the tendency of the rate of meaning to fall. After the peak, the degradation of data will grow exponentially and databases are compromised beyond repair. This is worse than data rot as the religion itself falls apart. But what happens when we can no longer gain a competitive advantage of the gathered data and the crisis of the ‘informed decision’ sets in? Is there a sick logic where peak data produces ever more global rulers who have openly abandoned reason and technocratic process in favour of gut (or phallic) instinct? When everyone has all information, the only surprise is the deliberately uninformed decision. Correlation inflation is real. So is the issue of redundancy. The signal to noise level has never been as low; but then, ever higher computational power can squeeze out more drops of significance… Manipulated at the moment of its capture, fueled by subliminal behavioral interventions and filtered through algorithms, users can no longer easily be fooled. However, many still feel fooled and it’s harder than ever to determine who is the fool. It was for everyone to see, and feel, that life is governed by numbers. People are aware that the academic funding darling called data science is a racket. After the manic, restrictive Covid years, the big data hype lost its innocence for good.
>
>
> As a result of the current platform stagnation, indifference, cynicism, denial, boredom and disbelief are on the rise. We are caught in a turbulent whirlwind of dialectical forces and can no longer make a distinction between drastic techno-determinist forces (such as automation) and the collapse of human awareness, leading to mass depression, refusal and uprises driven by anger, fear and resentment. In a good cybernetic tradition, the technical tipping point of peak data will be both attributed to an out of control army of (ro)bots and the rebel wisdom of a dissident intelligentsia that is both local and planetary.
>
>
> If the paradigm still holds that data is the new oil, the next obvious question should be: when do we arrive at peak data? The thesis here is that peak data will neither be reached because Moore’s Law (the doubling of chip capacity every two years) no longer holds, nor because of a technical ceiling in terms of storage capacity. While it is tempting to hold on to scientific evidence for technological stagnation, peak data will be reached because of inner exhaustion which slowly reaches a critical mass after which the implosion of data hegemony unfolds.
>
>
> According to Katherine Behar data is like plastic. “Big data’s pathological overaccumulation symptomize capitalist excess, like plastic, and big data threatens to bloat a naïve profile into a totality.” (5) Behar uses the obesity metaphor for the never-ending surplus production, one that Jean Baudrillard also used. In The Fatal Strategies, Baudrillard states: “The world is not dialectical – it is sworn to extremes…not to reconciliation or synthesis…we will fight obscenity with its own weapons… we will not oppose the beautiful to the ugly, but will look for the uglier than ugly.” In Baudrillard’s vision, the postmodern subject sits back and enjoys the implosion of big data infrastructure. This simultaneously occurs on the user level when checking routines get forgotten, leaving the hundreds of search results, product recommendations and social media friends for what they are. This is the supremacy of inaction. The problem is that hardly anyone has the guts aka mindfulness to not click, swipe and like.
>
>
> This is not merely a problem of ‘overload’ that can be solved with a periodic reset aimed at the self-imposed diet of the data-gathering agencies. The 2018 Data Prevention Manifesto <https://dataprevention.net/> is a programmatic statement in this respect. Instead of aiming for ‘data protection’ the solution should be allocated at the source: do not collect data in the first place, dismantle the collection devices, delete the software and uninstall the databases. Then reclaim the royal time/space needed to make proper decisions. We have the right to refrain and do not need to be told to forget. Don’t be impressed with the legal Gutmenschen that claim to protect privacy. Leave the data for what they are: symbolic waste. Stop data production from happening in the first place.
>
>
> Big data critique had its moment. And it didn’t really come from the media theorists. It was mostly coming from concerned scientists and the enlightened managerial class: pragmatic, reasonable, harmless and predictable. It was big data critique 101: data are not objective, data comes with interests, and so on. The first mistake was to accept the framing: big data. Rarely it was about the size and more about a widespread datafication and the unspoken new grand narrative it supports.
>
>
> Should we use terms such as ‘overcoming the empirical turn’? Can critique enter the inner life of technology or will it be condemned to observe from a distance? This is the key dilemma of radical data critique. Is the ultimate adversary of data-as-such oracles like Byung-Chul Han that work in the tradition of Martin Heidegger? Or should we rather look for 21st-century versions of the Frankfurt School, despite their own involvement in the radio research project and the study on the authoritarian personality? The challenge today is rather banal: tackle the ethics industrial complex. The reformist opposite of data refusal is framed as ‘responsible ethics’. What’s wrong with ‘engaging with actuality’? Whose facts?
>
>
> Let’s shape how the engaging way of becoming critical can be brought into existence in the context of ‘data’. Conditional mediations? Do we need to argue ‘from the inside’? Can we, and should we, make (our view on) data more ambiguous or should we dismiss data as such altogether? Is it possible to transcend data? Ignore? Subvert? Undermine? Sabotage, erasing data with magnets, ransomware? Why is ‘hacking’ not enough? An additional problem is the claim of cognitivism. Not everything is calculable. Let’s not ignore the real existing entropy. How to undermine or prevent the obliteration of life? Why should we optimize our lives in the first place and voluntarily participate in predictive systems that can only create more polarization, anger and anxiety? What’s to be done to escape this machine logic?
>
>
> It is claimed that data have taken over the predictive ability, theory, essays and poetry once possessed. The answer can only be one of radical negation and resistance (‘friction’) against this managerial technological violence. In 2020 Miriam Rasch published Friction–Ethics in Times of Dataism <https://www.debezigebij.nl/boek/frictie/>, published in Dutch by De Bezige Bij. In an English summary <https://www.eurozine.com/friction-and-the-aesthetics-of-the-smooth/> of her book written for Eurozine Rasch criticizes the desire for optimization in data science. Data are seen as the next step in science. According to Rasch, this belief is nowhere better exemplified than in Harari’s Homo Deus: A Brief History of Tomorrow. In his pancomputationalism “the universe, plant and animal, human and machine all work in the same way. In comparison with a machine, human beings are hopelessly inefficient and lacking in organization.” Dataism is framed as a religion all subjects of the socio-technological regime believe in. We live inside the data cosmology. The data sphere is here described as a natural monade: dataism is the inevitable paradigm of our times. Rasch believes that “in this mechanistic worldview, extrapolated into a not-so-distant future where we will all function as a computer. In this vision, downfall and progress go hand in hand. Inefficient human faces a certain destiny. Dataism is a cynical faith depicting today’s world as a deplorable intermediate stage on the path to something better.”
>
>
> Data as such are numb, muddy and silent by default. Information does the [not?]talk. The tendency of data to accumulate inevitably ends up in obesity. In line with Vincent van Gogh’s “Real painters do not paint things as they are…They paint them as they themselves feel them to be,” we need impressionist data approaches. In opposition to the current data regime the Institute of Network Culture (INC) has focused on the production and support of ‘rather not’ theories and critiques of internet culture. One cannot expect that such data scepticism is met with enthusiasm. The untimely continental-European perspectives aim to build autonomous, interdisciplinary research networks on topics such as search, Wikipedia, social media alternatives and revenue models for the arts. INC does not believe in ‘data science’ and explicitly aims to undermine its core belief system: the data religion itself. This is not merely done out of resentment as decades of sadist neo-liberal budget cuts under the flag of the ‘creative industries’ have all but diminished the arts and humanities work. In this respect, we have not forgotten the loud silence of the so-called ‘hard science’ communities over the cruel policies that ultimately crippled the arts. We unapologetically believe in the subversive power of theory, philosophy, literature and the arts and the ultimate victory of poetry over bean-counting. Measurement is in the process of orchestrating a power grab, aimed to destroy critical thinking as such. There cannot be peace or mutual understanding in a world where data are explicitly utilized to eliminate culture.
>
>
> There is plenty of criticism of the capitalist data regime. We do not lack investigations of its implications, from David Beer’s The Data Gaze to Steffen Mau’s The Metric Society. In this context it is important to go beyond the—in itself convincing—US ‘bias’ studies of racist algorithms and post-colonial AI and focus on data as such, not just how they are optimized to discriminate along lines of caste, class, gender or ethnicity. The thesis here is that there is no ‘big data for good’. There is no positive telos. There is no progressive ranking and rating. While it is tempting to portray ordinary users as victims of ‘data colonialism’, as Couldry and Mejias do in The Costs of Connection <https://colonizedbydata.com/>, the ‘decolonize the internet’ metaphor may be misused.
>
>
> The key notion here is the continuing mechanism of extraction throughout the centuries. But this is where the comparison between colonial rule and platform realism ends. What is essentially different is the subject configuration. Colonialism is a form of violent rule while platforms are driven by the performative desire for comfort and social life among ordinary users. Adorno’s wrong life that cannot be lived rightly will have to be applied to the messy platform reality. There is no data ethics inside a wrong system. Facing the exclusion logic of the current data regimes, what’s needed is a relentless immanent critique and a halt to constructive collaborations. (6)
>
>
> What are the so-called ‘sciences’ doing to uphold the unfolding data-driven educational disaster? Are they ready to repair the damage done and dismantle their own datacenters? As we do not hear anything regarding the disastrous takeover of dataism, we recently started to argue for a defunding of all data sciences and AI research (including its ethics washing operations), calling for an immediate redistribution of research funds. The frustration in society with algorithms, artificial stupidity and facial recognition systems is already spilling onto the streets. Data have not been able to debunk antivaxers and no longer legitimize lockdown regimes. There is an urgent need to unmask the ‘neutrality’ of the libertarian male-geek computer science system, and take a stand: dismantle Facebook and Google now, ban Booking, Uber and Airbnb, build firm peer-to-peer (payment) systems and create one, two, many local non-profit cooperatives that focus on distribution as alternatives to, for instance, Amazon. The ultimate aim of dataism is becoming clear: to undermine the emergence of a new self that is no longer paranoid, depressed and insecure. What characters emerge once the performative quantified self metamorphoses?
>
>
> “If a vehicle accident occurs, you can call up the images that the vehicles involved recorded to decide what caused the accident and which algorithms need improvements.” This Forbes quote exemplifies dataism. Ethics commissions will not be able to make a difference and will only prolong the data regime. What we need is a ‘public stack’ of civic technologies that are truly sovereign monads, no longer based on the Brussels enforcement of interoperability between ‘open’ data clusters. What we need is data prevention, not protection [this is also really good]. Let’s design different protocols that end up in the collection of fewer data. Destroy data at the source, and no longer capture, let alone preserve them. This is the real ‘de-automation’ design challenge Rushkoff’s Team Human is facing, in line with Katherine Behar’s ‘deceleration’.
>
>
> What’s to be done after the deconstruction of the data cult? Often a void is felt after a rigorous exposure to the colonial core of today’s data practices. (7) Dismantling data colonialism will need to stress the invisible aspect of structural violence. And deal with the attractive side of ‘free’ apps and connectivity. This is offered, for instance, by Facebook, in places where mass poverty is used as a pretext for large scale extractivism in exchange for free services. Will ‘peak data’ in this context mean infrastructural breakdown? There must be an exit strategy developed, otherwise there is little else to do than resting on imperial ruins while still locked inside geo-political confinements. Users are not slaves, we need more precise categories here. Racism and discrimination are very real. Yet, they may not happen on the spot, on the visual interface level. Data are sold and stored and sorted, to be used later. Such delay and reframing can cause an unexpected revenge act where users are surprised by data that suddenly pops up and is used against them.
>
>
> One day, soon, people will wake up in disbelief, realizing that data is dead. The point is not to overcome the dark side of data, regulate IT giants and establish ‘responsible’ governance but to lay networked data amassing aside. Once system maintenance subsides, data gathering regimes fall in disrepair. Relational databases may still exist but one day they will simply stop bothering us. Fuelled by organized unbelief the invasive, sneaky, manipulative side of the measure mania fades away. Rarely anyone will remember the data religion.
>
>
> —
>
>
> 1. Jörg Später, Siegfried Kracauer, Eine Biographie, Suhrkamp Verlag, Berlin, 2016, p. 504.
>
>
> 2. https://thenewinquiry.com/view-from-nowhere/
>
>
> 3. Katherine Behar, Bigger Than You: Big Data and Obesity, Punctum Books, Earth, Milky Way, 2016, p. 39. This is also the argument of Wendy Chun’s Discriminating Data, MIT Press, Cambridge (Mass.), 2021.
>
>
> 4. Reference in friendly dialogue with <https://www.digitalearth.art/vertical-atlas>.
>
>
> 5. Katherine Behar, p.10.
>
>
> 6. This line of thought is inspired by Sunny Dhillon’s valuable analysis of the role decolonization rhetoric is playing inside the neoliberal university: <https://convivialthinking.org/index.php/2021/09/25/critique-of-decolonisation-projects/>. Slightly adjusted to the context here, he writes: “The theorist and activist entangled in the neoliberal university must resist piecemeal approach to a neatly packable, commodified rendering of supposed decolonial data practice. They must, instead, expend their energies on a relentless immanent critique of the discourses surrounding data decolonisation.”
>
>
> 7. Respect to the work of Nick Couldry and Ulises Ali Mejias. In particular their article The decolonial turn in data and technology research: what is at stake and where is it heading? <https://www.tandfonline.com/doi/full/10.1080/1369118X.2021.1986102>. The authors mention the principles of the Non-Aligned Technologies Movement, which are close to the ‘peak data’ exhaustion/data prevention agenda presented here: “Boycotting of extractivist technologies and the use of alternative tools; divestment at local and national government levels from Big Tech (by not buying or accepting their ‘free’ products); the re-appropriation of data (and the products of data) on behalf of those who generate it; the implementation of taxes and sanctions against Big Tech to repair the damage done by their technologies; the bolstering of public education – in the form of citizen research, literacy campaigns, decolonial thinking – to understand the dangers of data colonialism (..).”
>
>
>

#  distributed via <nettime>: no commercial use without permission
#  <nettime>  is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: http://mx.kein.org/mailman/listinfo/nettime-l
#  archive: http://www.nettime.org contact: nettime@kein.org
#  @nettime_bot tweets mail w/ sender unless #ANON is in Subject: