We need to talk about data centres.

For the 2nd or 3rd time this week I've seen someone comment on a new data centre build with a stat about how 80% of data is never accessed. Then they talk about the energy and cooling used in modern DCs.

The reality is that data storage is actually incredibly efficient, and uses fuck all power. A hard disk draws less than 10 W and stores multiple users' data.

Storing data, our photos, our memories, our history: that is not the problem.

What is? 1/n

The thing driving the need for bigger, more power- and water-hungry data centres is AI. Sparkling autocarrot. Whereas a machine in a rack full of hard disks might consume a couple of hundred watts, a machine loaded up with a typical complement of 8 "AI accelerators" can be pulling in the region of 5 kW. Over an order of magnitude more power than is needed to store a lifetime's photos for hundreds of people.
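To put very rough numbers on that, here is a back-of-the-envelope sketch. The server wattages are the figures from the post; the disk count, disk capacity, and per-user storage are assumptions, not measurements:

```python
# Back-of-the-envelope comparison. Server wattages are the post's figures;
# disk count, disk capacity, and per-user storage are assumptions.
STORAGE_SERVER_W = 200    # rack machine full of hard disks (~couple hundred watts)
AI_SERVER_W = 5_000       # machine with 8 "AI accelerators" (~5 kW)
DISKS_PER_SERVER = 24     # assumption: typical storage chassis
TB_PER_DISK = 20          # assumption: modern high-capacity drive
TB_PER_USER = 0.5         # assumption: a lifetime of photos and documents

users_per_server = DISKS_PER_SERVER * TB_PER_DISK / TB_PER_USER
watts_per_user = STORAGE_SERVER_W / users_per_server

print(f"users per storage server: {users_per_server:.0f}")    # ~960
print(f"storage power per user:   {watts_per_user:.2f} W")    # ~0.21 W
print(f"one AI server vs one storage server: {AI_SERVER_W / STORAGE_SERVER_W:.0f}x")  # 25x
```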

And why? To what end?

2/n

I've worked in this industry for over a quarter of a century.

At no point have I found myself thinking "I wish I could just ask the computer to write this email for me" or "I wish the computer could write my code for me". MS is adding Copilot functionality to lots of products. Not as opt in, but opt out. And it's a right hassle to turn it off. Why? So someone can ask it to write a longer email from a prompt that the recipient can then ask the AI to summarise for them?

3/n

There are certainly some areas where machine learning (note, I'm not calling it AI) has its uses. Medical research springs to mind. But a ubiquitous AI assistant rolled into all our products? Why? It's just using too much power, too many resources, and for what? Sparkling autocarrot.

Encoding the worst of our society in a bit stream. Exacerbating inequality, prejudice, and hate.

In comp sci there's a term. GIGO. Garbage in. Garbage out.

4/n

These large language models are being fed on the combined mass of the world's online content. Your tweets, Facebook posts, forum posts, that blog you forgot you started. All of it is being fed into the black box of the LLM. The internet provides us unprecedented access to the world's information. But it is also an unprecedented collection of hate. We've seen this time and time again. From chatbots that start shouting Nazi propaganda, to CV-vetting systems that won't hire women.

5/n

Garbage in. Garbage out.

And what makes this even more terrifying is that when you look at a webpage, it's often hard to tell if it's been generated with sparkling autocarrot, or written by a human. If we can't tell, then what hope does the LLM have? And so we're gonna end up with the next generation of models being fed on the output of the previous. This is going to create feedback loops. Reinforcing the worst the model has to offer. Strengthening the hate. The prejudice.

6/n

And because we don't know what was created how, there's no way to control what feeds the models. It's just gonna enshittify. And fast.

AI has all the hallmarks of a bubble. Like crypto before it, and half a dozen other bubbles before that, all of which trace their heritage right back to the South Sea Bubble (no, not tulip mania, but that's something for a different thread).

Except this bubble has gone more mainstream. It's consuming way more resources than any before it.

7/n

Water is going to become the next big inequality front. As the climate changes, clean fresh water is going to become harder to come by: more expensive, and more unequally shared. That same water is being poured over panels in data centres to cool the servers. To cool the AI accelerators generating content no one asked for. Enshittifying the knowledge base of humanity. Just so a few people can make some money.

8/n

Storing our data, our memories, our photos, on servers in data centres that are built in sensible places isn't inherently a bad thing. And we shouldn't allow ourselves to fall for the trope that 80% of it is never accessed. But building data centres that use ten times the energy, and need even more water, in deserts and water-stressed areas, to drive sparkling autocarrot that no one asked for? That we should be more vocal about.

9/9.

@quixoticgeek Training GPT-3 took as much water as producing 460 hamburgers.

You raise a very good point about the location of data centers. There's plenty of room along the Wisconsin and Michigan coastlines with access to more fresh water than could ever be consumed in a thousand years. If water were priced appropriately in drought-ridden areas, I imagine data centers would be happy to relocate. The first step is to fix the broken politics that subsidize silage/beef.

@jamiemccarthy @quixoticgeek why is fresh water being used to cool data centers? The data center machines couldn’t care less if it is clean or not. That seems a big oversight.

(And before someone says salt water corrodes, why the f would you need to put the water right against the machines, it just needs to absorb thermal energy, not make out with it.)

@davidaugust @jamiemccarthy @quixoticgeek Some companies do use seawater when it's the best option.

When I was at Google a number of years ago, we had facilities in Finland using seawater, and facilities in a number of other places that used grey water.

(grey water being sewage, runoff, etc. that's gone through water treatment)

For a long time most companies used colos run by companies that had zero incentive to make their facilities power-efficient (they were the "industry standard"). However, over the last few years, I think a lot of companies have followed Google's lead and stopped wasting massive amounts of energy and water on cooling.

(Microsoft has done some really interesting work with passive cooling by sinking clusters in sealed pressure-vessels under the sea)

@distributednerd @davidaugust @quixoticgeek Fascinating! From having had a saltwater aquarium years ago, I would have assumed salt would just accumulate on *everything* and be a huge pain to clean. Good to know that can work.

@jamiemccarthy @distributednerd @davidaugust @quixoticgeek I'd presume that it uses heat exchangers to pass heat from the liquid being used to cool the systems to the seawater, rather than cooling the systems directly. Some facilities are doing heat exchange to heat local swimming pools.

@distributednerd @jamiemccarthy @quixoticgeek it's exciting to learn of that progress. Sounds like things may be moving in a good direction.

@distributednerd @davidaugust @jamiemccarthy @quixoticgeek Excess heat is like any other pollution...there is no "away." I guess sinking it in the ocean depths has less impact than dumping it in a river, but it's still going to disrupt ecosystems on a local scale.

@phil_stevens @distributednerd @davidaugust @jamiemccarthy @quixoticgeek Iceland is a great place for data centers - 100% clean energy, and they can use the waste heat most of the year (and if not, the local ecosystems are used to volcanoes, so a bit of warm water is not unexpected). But of course the ping is 3 microseconds slower...

@quixoticgeek @phil_stevens @distributednerd @davidaugust @jamiemccarthy For me the answer is usually caffeine deficiency syndrome. I use primitive devices that mostly leave my writing alone...

@phil_stevens @distributednerd @davidaugust @quixoticgeek Unlike nuclear reactors, my understanding is that data centers end up evaporating all or nearly all the water they use, so there's no need for a cooling tower before returning water to its source. I could be wrong.

@jamiemccarthy @phil_stevens @distributednerd @davidaugust that is the case with terrestrial DCs. Several people have commented on the MS experiment with an underwater DC. In that case it does warm the local water.

@davidaugust @jamiemccarthy @quixoticgeek Because large-scale datacenters use evaporative cooling, since it's really power efficient, and that has to be clean water to prevent salt deposits in the cooling system. And it really does use up the water, since it gets evaporated away and not everything is recovered. Alternatively they just absorb the heat into the water and then dump it out again, since that's more energy efficient than having to cool it back down, etc.

@quixoticgeek as a practical matter, I don't think we can guilt trip users into pretending a given convenience a) isn't cool after all or b) should not be used. People get into accidents because they're idiots and use their cell phones while they're driving. Even so, cell phones are part of life now.

What we can do is tax/regulate the hell out of large commercial entities that, in making cool new tech convenient, move us 1 cm closer to climate disaster. You don't get to make money for doing that.

@YusufToropov @quixoticgeek

But that's a handwave. "We" won't tax the hell out of them, any more than we tax the hell out of billionaires.

This "incentivize good choices" model is what's gotten us an environment in which everybody is pissing microplastics and the amount of CO2 in the atmosphere went up by 1% in a year in 2023.

Regulation works if people decide to do it. This argument is what's been convincing otherwise smart people not to try to do it.

@abhayakara @quixoticgeek

Answer: vote and organise. For Democrats, if you happen to live in the USA.

And yes, this is why Europe is farther along on this. Americans can't seem to grasp the absurdly high stakes. Every syllable of what you just wrote is why I'm trying to make sure we don't get distracted/manipulated into electing a climate denier slate nationwide. Govt must take this seriously.

Guilt tripping users isn't the answer regardless. That's another distraction in my humble opinion.

@abhayakara @quixoticgeek

And by the way, I would start with Exxon when it comes to taxing and regulating the hell out of people.

@YusufToropov @quixoticgeek

I agree. But I don't think OP was guilt tripping anyone. OP was reporting what's going on. That's important.

@abhayakara @quixoticgeek

This felt like guilt tripping:

"But building datacentres that use ten times the energy, and need even more water, in deserts, and water stressed areas, to drive sparkling autocarrot that noone asked for. That we should be more vocal about."

I prefer we talk about legislation that keeps companies responsible. I disagree that I don't want AI. I *do* want AI. It's inaccurate to say no one is asking for the next wave of development. I am. Even if I don't know what it is.

@YusufToropov @quixoticgeek

Okay, but the ask was literally "that we should be more vocal about" [it].

I think LLMs are interesting, and GAI would also be interesting. But we actually already have GAI. It's called corporations and societies. These are artificial cognitive structures, built by humans, running on a substrate that is known to be intelligent.

So if you're interested in how to improve the state of AI, you don't have to wait!

@abhayakara @quixoticgeek

Just give the people who are building responsible data centers a huge subsidy and slap huge taxes and fines on all the rest. I don't try to dictate what the marketplace is going to create, or what is going to be cool, a year from now. It's pointless.

@YusufToropov @quixoticgeek

Maybe a little polderpolitiek wouldn't hurt, though. I agree that we shouldn't be legislating outcomes, but right now we're deliberately leaving externalities unpriced, which is a subsidy that always and only benefits approaches that are wasteful. We are not starting with a level playing field, so of course we see abuses.

@YusufToropov @abhayakara @quixoticgeek You've missed the point completely. We are wasting electricity on machines performing cheap tricks. Feeding more data into LLMs won't make them better. The entire bubble is nonsense, pumping carbon into the atmosphere for zero benefit... in fact, it's a detriment to society.

@YusufToropov @abhayakara @quixoticgeek

I want AI for use in data science and speeding up workflow in areas, like research, that benefit humans.

What I don't want is AI that is so ubiquitous that it wastes energy and outputs inferior-quality information and generally enshittifies the planet.

We also have to have AI because of the arms race it has become. The way to beat a malicious AI is likely to be one that offers defense against it.

This isn't average-joe users' fault, though. It's 💯 on the companies and governments that are recklessly pursuing unethical or outright pointless versions of AI with no regard for any damage caused in the process.

@IntentionallyBLANK @abhayakara @quixoticgeek

Exactly. Personally, I would be cool seizing and nationalizing the assets of any company that in 2024 runs data centers that aren't carbon neutral, but hey, I'm on the margins.

Just don't tell me we don't need R&D in this area because we do.

@YusufToropov @abhayakara @quixoticgeek
Agreed. There are a number of things that I think should be nationalized, just because they're either too essential or too dangerous to "let the invisible hand of the market decide".

I say this recognizing that a camel is a horse designed by a committee, and recognizing some of the absurdity of government projects.

Right now I feel like we've handed the private sector our generation's version of the Manhattan Project and said, "Go do your thing." (Side note: I've never watched Oppenheimer, though I have read a bit about the Manhattan Project and love Feynman's memoir. Even as batty as that was, it's still more sane than Silicon Valley.)

@IntentionallyBLANK @abhayakara @quixoticgeek

Yes. We need BIG ACTION NOW and nationalizing companies that don't get it is utterly appropriate. I am all in for this and a politician who can build escalating mainstream support for it.

@YusufToropov @IntentionallyBLANK @quixoticgeek

But this is all mental masturbation. Where is the govt that is going to take this big action?

Expecting a heroic rescue is a recipe for continued disaster.

@abhayakara @YusufToropov @quixoticgeek
For me, personally, it's not much, but I plan to incorporate as much learning about AI as I can into library programming. I'm pushing for an adult science programming series that gets experts in the field talking with people in the community about STEM topics that impact all of us. I'm really hoping to do things to get people thinking practically about AI (not going off about Skynet or Roko's Basilisk or something stupid, nor thinking that "prompt engineering" is what K-12 kids need more than fundamental instruction in reading, writing, math, and critical thinking).

So, basically, getting the word out about what AI is and isn't, while trying to facilitate discussion about resources, etc. I don't have my hands on the levers of power so to speak, but I am a library and research critter. In short, the answer is to do what you can within your sphere of influence. Even if it's tiny.

@YusufToropov @quixoticgeek

BTW, Americans do grasp the absurdly high stakes, and the majority vote accordingly.

The difference between America and, say, the Netherlands, is that in the Netherlands when Geert Wilders gets the biggest number of seats, the question is, will Yesilgöz cooperate with him. In the U.S., it's "well, I guess Trump won even though the majority voted against him."

Our news media is Pravda from the 80's, even the "left media." It's very difficult here to get real news.

@YusufToropov @abhayakara @quixoticgeek the reason why the USA is so far behind Europe is simply that our oligarchs have much tighter control over public discourse, because the corporate media are all billionaire owned, and particularly since the execrable Citizens United decision, both major parties are dependent on overlapping subsets of those same billionaires for campaign funding. The oligarchs have captured the public sphere.

@YusufToropov @abhayakara @quixoticgeek

*Your* answer is facile. Yes, organize, and slow the collapse with voting when it'll work, but voting is less and less powerful as time goes on thanks to an unpacked SCOTUS and very well-lubed fascist political regime. National Dems are at best nearly completely useless, either unable or--more likely--unwilling to pursue policies that will alienate bourgeois donors.

Dems aren't the answer, and never can be. Nor are you making sure of anything, expat.

@YusufToropov @abhayakara @quixoticgeek

I doubt you've had your foot on the ground in a while, since you don't seem to understand that climate denial was already the entire sweep in '16. And in 2000, when SCOTUS decided an election for a climate-denialist and oil exec.

We get the stakes, we're just fucking exhausted. Work is hell and we can barely afford a roof over our heads, and the population that can't grows all the time.

Standard Oil should've been and can still be nationalized.

@abhayakara @YusufToropov @quixoticgeek
Pissing microplastics is probably the best case scenario. At least that would suggest it actually leaves the body.

As far as taking care of the environment goes, you can't leave that to the better angels of their nature. Not any more than you can do that with "charity". It's either demanded & enforced, or it doesn't happen, or it happens to such a low degree as to be roughly equivalent to not happening.

@YusufToropov @quixoticgeek I fully intend on guilt tripping users into thinking AI isn't cool. Gonna do it as much as I possibly can.

@YusufToropov @quixoticgeek users aren't demanding it, companies are forcing it on them

@quixoticgeek
1. LLMs are useful far beyond sparkling autocorrect. In fact their embeddings are arguably their most useful feature.
2. LLMs provide a lot of different options for accessibility needs. From sparkling autocorrect allowing for much better encoding and decoding of voice data, to helping with attention management.
3. Immersion cooling (stick your server in vegetable oil) could already be being used to save water. They don't do it because of graft. Why pay $9 per gallon for vegetable oil that will last forever, when you can pay 0.35 cents per gallon for water (about 2,500 times less; see the quick sums after this list)? States are literally giving companies water and paying them to use more of it.
4. We have known for over a decade how to remove bias from LLMs. It sometimes degrades performance, so they've decided not to do that. In other ways they have debiased it.
5. It's flatly not true that learning from the data it creates will reinforce bias. In fact, given the way they currently are designed, it will reduce bias. Also, if we can't tell that it was made by a machine, then it doesn't matter whether or not it was.
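A quick sanity check of the price ratio in point 3. Both per-gallon prices are the commenter's figures, not sourced pricing data:

```python
# Sanity check of the price comparison in point 3. Both prices are the
# commenter's figures, not sourced pricing data.
oil_usd_per_gallon = 9.00        # vegetable oil, a one-off fill that "lasts forever"
water_usd_per_gallon = 0.0035    # 0.35 cents per gallon

ratio = oil_usd_per_gallon / water_usd_per_gallon
print(f"oil costs ~{ratio:.0f}x more per gallon than water")  # ~2571x, i.e. the ~2,500x above
```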

@quixoticgeek I worry AI is going to have a similar curve to email, but harsher: I feel like email started as this nice efficient service, then got infested with spammers and junk, and now a huge chunk of email is just garbage. AI seems to be skipping the 'efficiency' stage and going right to the bloat.

@dalias While I agree with the main point of @quixoticgeek that pursuit of the (IMO garbage) goals of AI is much more of a problem than data, it's not true that storage costs zero differentially when not accessed. A good number for a mix of spinning disk and flash is about a kW of power usage per petabyte. That would be roughly a megawatt for a medium-sized hyperscaler data center, roughly a percent of its usage. Smaller than the AI/CPU/GPU footprint, but not at all zero or even close to zero.
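A rough sketch of how those numbers hang together. The 1 kW/PB figure is from the reply; the exabyte of stored data and the ~100 MW facility draw are assumptions used only to show the scaling:

```python
# Rough sketch of the storage-power scaling. The 1 kW/PB figure is from the
# reply above; the stored volume and facility draw are assumptions.
KW_PER_PB = 1.0        # mixed spinning disk + flash, per the reply
STORED_PB = 1_000      # assumption: ~1 EB in a medium-sized hyperscaler DC
FACILITY_MW = 100.0    # assumption: total facility draw

storage_mw = KW_PER_PB * STORED_PB / 1_000
share = 100 * storage_mw / FACILITY_MW
print(f"storage power: ~{storage_mw:.1f} MW (~{share:.0f}% of the facility)")  # ~1 MW, ~1%
```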

@AlanSill @quixoticgeek OK, what I should have said is that, in some sort of asymptotic sense with proper optimization, data not accessed consumes no energy. Imagine an ongoing defrag-like process that migrates data that has not been accessed in a long time to storage devices that are physically powered down, or even a setup where each client's data (at some fine grain) is on a separately-sleepable eMMC chip or similar.
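A minimal sketch of that "defrag-like" idea, with hypothetical names (the tier objects, `spin_down`, and the one-year threshold are all invented for illustration, not any real system's API):

```python
from datetime import datetime, timedelta, timezone

COLD_AFTER = timedelta(days=365)  # assumption: untouched for a year counts as "cold"

def tier_cold_data(objects, hot_tier, cold_tier, now=None):
    """Minimal sketch of the 'defrag-like' migration described above: move
    long-unaccessed objects onto devices that can then be powered down.
    `objects` yields records with .id and .last_access; hot_tier and
    cold_tier are hypothetical storage backends with read/write/delete."""
    now = now or datetime.now(timezone.utc)
    moved = 0
    for obj in objects:
        if now - obj.last_access > COLD_AFTER:
            cold_tier.write(obj.id, hot_tier.read(obj.id))  # copy onto the sleepable tier
            hot_tier.delete(obj.id)
            moved += 1
    cold_tier.spin_down()  # hypothetical call: power the cold devices off until next access
    return moved
```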

@quixoticgeek Stochastic Parrots Considered Harmful

Sparkling Autocarrot Considered Harmful

Take a look some time at Ireland's current and short term projected datacentre electricity use as % of total generation. 20% now, 70% by 2030. That's what a friendly corporate tax haven regime gets you these days.

@quixoticgeek GenAI also adds nothing to our overall cultural value. Every piece of text or scribble by a human would be valuable to a future historian trying to understand our time. "I had great Pho last night" posted from Longmont, Colorado in 2003 tells a historian about cultural spread and acceptance in our time. A doodle of a person with a selfie stick blocking a vista tells an entire story. GenAI blurs and obscures that. It tells us nothing about any human attitude, aspiration, or vision.

@quixoticgeek Clear and compelling warning about the natural consequences of the current round of commercialization of AI research!
The threat to drinking water might be less severe -- Microsoft (and maybe others?) have been researching "underwater datacenters" in deep sea water to manage the cooling needs.

@bobhy underwater datacentres are unfortunately still a gimmick. They are too difficult to service. And sticking stuff in the sea is not a panacea: the heat still goes somewhere. You heat up the water around the DC, and that changes the ecosystem. We see this with the vents for the cooling water at nuclear power stations. The warm water changes the wildlife there.

@quixoticgeek Agree with all of these caveats, avoid one problem, create another. The trick is to create problems at a somewhat larger scale than you were avoiding the original problems at.

@quixoticgeek The larger point -- once these AI "factories" are built, their owners *need* to crank out "product". How can we get the most social benefit from the ongoing resource cost?
If not LLM-generated love letters (as Edge / Copilot is offering me this morning), then maybe a more limited research assistant to help me find relevant factual information, maybe flag bots and troll farms in social contexts?
Leave the human-oriented creative stuff to humans?

@quixoticgeek I'd like to highlight that "AI" isn't actually anything new, it's just larger than ever now. Which is why this serverfarm issue isn't new. Before, it mostly came in the form of surveillance advertising & personalized recommendations.

*Our need for an income is what drives waste!*

The legit value of serverfarms (though I am sympathetic to arguments that we over-rely on this) is near-entirely in data storage/publishing. And message routing.

The legit value costs near-nothing.
