
#aihype

9 posts · 6 participants · 0 posts today

Once again, Germany will get a minister of research without the slightest clue of
- how research is done
- how research is organized
- what the current issues are
- etc.

However, she has the right party membership and has already demonstrated, eloquently, that she can combine having no idea with holding a strong opinion (#blockchain).

That, and the fact that we are still in an #aihype, makes me fear the worst. Mark my words!

Continued thread

Footnote: The #AI2027 misunderstanding plagues most self-trained (non-academic) #AI researchers as well. Successful research is much more "art" than "craft": you can't brute-force it by trying lots of hypotheses, because there are just too many. Deep understanding and good intuition are how you find the golden needle in the haystack of researchable hypotheses. #aihype

Have you heard about the AI 2027 forecast? I don't believe it. IMHO the least plausible part is the leap from AI coding to AI research: the story totally underestimates the unimaginably vast spaces of potentially plausible hypotheses, which good researchers must use their knowledge and understanding to prune down to the few hypotheses actually worth testing. Coding agents are not going to cut it (and none that can are available anytime soon). #aihype #GenAI #LLM #AI2027

Inspired by @Iris 's recent poll, I suppose... I’m writing up my #psych #phd thesis and am currently looking at the methods chapter. I’m describing all the samples, procedures, measures, statistical tools and procedures I’ve used in my articles, and the ethical considerations.

However, although I haven’t seen this in other theses, and although nobody has told me I need to do it, I feel like including a section on «the use of #AI technologies» (read: ChatGPT and other LLMs). The thing is, I’m getting the sense that their use has become extremely prevalent in a very short amount of time, if nothing else then «as a brainstorming partner», or as help to paraphrase sentences for clarity or fix punctuation. And the reason I want to make a statement out of this in my thesis is that I haven’t used them. Not one bit, in the least sense. I never wanted to, and I’m very happy I haven’t.

Is this worth making a statement about in the methods chapter? How would you go about writing it? What info would you include? Do you know of good examples of these kinds of disclaimers/statements in academic writing? #AIhype

MM: "One strange thing about AI is that we built it—we trained it—but we don’t understand how it works. It’s so complex. Even the engineers at OpenAI who made ChatGPT don’t fully understand why it behaves the way it does.

It’s not unlike how we don’t fully understand ourselves. I can’t open up someone’s brain and figure out how they think—it’s just too complex.

When we study human intelligence, we use both psychology—controlled experiments that analyze behavior—and neuroscience, where we stick probes in the brain and try to understand what neurons or groups of neurons are doing.

I think the analogy applies to AI too: some people evaluate AI by looking at behavior, while others “stick probes” into neural networks to try to understand what’s going on internally. These are complementary approaches.

But there are problems with both. With the behavioral approach, we see that these systems pass things like the bar exam or the medical licensing exam—but what does that really tell us?

Unfortunately, passing those exams doesn’t mean the systems can do the other things we’d expect from a human who passed them. So just looking at behavior on tests or benchmarks isn’t always informative. That’s something people in the field have referred to as a crisis of evaluation."

blog.citp.princeton.edu/2025/0

CITP Blog · A Guide to Cutting Through AI Hype: Arvind Narayanan and Melanie Mitchell Discuss Artificial and Human Intelligence
Last Thursday’s Princeton Public Lecture on AI hype began with brief talks based on our respective books. The meat of the event was a discussion between the two of us and with the audience. A lightly edited transcript follows. Photo credit: Floriaan Tasche AN: You gave the example of ChatGPT being unable to comply with […]

Who could have predicted this? 🙄 State-of-the-art LLMs score below 5% on the 2025 mathematical olympiad, despite having been trained extensively on past editions:

arxiv.org/abs/2503.21934

arXiv.org · Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
Recent math benchmarks for large language models (LLMs) such as MathArena indicate that state-of-the-art reasoning models achieve impressive performance on mathematical competitions like AIME, with the leading model, o3-mini, achieving scores comparable to top human competitors. However, these benchmarks evaluate models solely based on final numerical answers, neglecting rigorous reasoning and proof generation which are essential for real-world mathematical tasks. To address this, we introduce the first comprehensive evaluation of full-solution reasoning for challenging mathematical problems. Using expert human annotators, we evaluated several state-of-the-art reasoning models on the six problems from the 2025 USAMO within hours of their release. Our results reveal that all tested models struggled significantly, achieving less than 5% on average. Through detailed analysis of reasoning traces, we identify the most common failure modes and find several unwanted artifacts arising from the optimization strategies employed during model training. Overall, our results suggest that current LLMs are inadequate for rigorous mathematical reasoning tasks, highlighting the need for substantial improvements in reasoning and proof generation capabilities.

#ai #AIhype #llm

"My core theses — The Rot Economy (that the tech industry has become dominated by growth), The Rot-Com Bubble (that the tech industry has run out of hyper-growth ideas), and that generative AI has created a kind of capitalist death cult where nobody wants to admit that they're not making any money — are far from comfortable.

The ramifications of a tech industry that has become captured by growth are that true innovation is being smothered by people who neither experience real problems nor know how (or want) to fix them, and that the products we use every day are being made worse for a profit. These incentives have destroyed value-creation in venture capital and Silicon Valley at large, lionizing those who are able to show great growth metrics rather than those creating meaningful products that help human beings.

The ramifications of the end of hyper-growth mean a massive reckoning for the valuations of tech companies, which will lead to tens of thousands of layoffs and a prolonged depression in Silicon Valley, the likes of which we've never seen.

The ramifications of the collapse of generative AI are much, much worse. On top of the fact that the largest tech companies have burned hundreds of billions of dollars to propagate software that doesn't really do anything that resembles what we think artificial intelligence looks like, we're now seeing that every major tech company (and an alarming amount of non-tech companies!) is willing to follow whatever it is that the market agrees is popular, even if the idea itself is flawed.

Generative AI has laid bare exactly how little the markets think about ideas, and how willing the powerful are to try and shove something unprofitable, unsustainable and questionably-useful down people's throats as a means of promoting growth.
(...)
In short, reality can fucking suck, but a true skeptic learns to live in it."

wheresyoured.at/optimistic-cow

Ed Zitron's Where's Your Ed At · The Phony Comforts of AI Optimism
A few months ago, Casey Newton of Platformer ran a piece called "The phony comforts of AI skepticism," framing those who would criticize generative AI as "having fun," damning them as "hyper-fixated on the things [AI] can't do." I am not going to focus too hard on this blog, in […]

AI Will Replace Engineers? 🤖
Oh, sweet summer child…

AI isn’t thinking. It’s autocomplete on steroids.
It guesses confidently, fails quietly, and we pretend it’s magic.

Engineers won’t be replaced.
They’ll be cleaning up AI’s mess.

But sure, dream of your AI CEO…
Full Post: linkedin.com/posts/yuna-morgen