@prologic@twtxt.net The headline is interesting and sent me down a rabbit hole trying to understand what the paper (https://aclanthology.org/2024.acl-long.279/) actually says.
The result is interesting, but the Neuroscience News headline greatly overstates it. If I've understood right, they are arguing (with strong evidence) that the simple technique of making neural nets bigger and bigger isn't quite as magically effective as people say, at least if you use it on its own. In particular, they evaluate LLMs without two common enhancements, in-context learning and instruction tuning. Both of those involve using a small number of examples of the particular task to improve the model's performance, and they turn them off because they are not part of what is called "emergence": "an ability to solve a task which is absent in smaller models, but present in LLMs".
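To make concrete what gets switched off, here's a tiny, purely illustrative Python sketch (my own toy example, not from the paper) of the difference between a zero-shot prompt and a few-shot, in-context-learning prompt for a made-up sentiment task:

```python
# Illustrative only: zero-shot vs. few-shot (in-context learning) prompting,
# shown as plain prompt strings for a toy sentiment task.

task_input = "The battery died after an hour."

# Zero-shot: the model sees only an instruction and the input, no worked examples.
zero_shot_prompt = (
    "Classify the sentiment of the following review as positive or negative.\n"
    f"Review: {task_input}\n"
    "Sentiment:"
)

# In-context learning: the same task, but with a handful of solved examples
# prepended to the prompt. This is one of the enhancements the paper turns off
# when testing for emergence.
examples = [
    ("Loved it, works perfectly.", "positive"),
    ("Arrived broken and support never replied.", "negative"),
]
few_shot_prompt = (
    "Classify the sentiment of the following reviews as positive or negative.\n"
    + "".join(f"Review: {text}\nSentiment: {label}\n" for text, label in examples)
    + f"Review: {task_input}\nSentiment:"
)

print(zero_shot_prompt)
print("---")
print(few_shot_prompt)
```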
They show that these restricted LLMs only outperform smaller models (i.e. demonstrate emergence) on certain tasks, and then (end of Section 4.1) discuss the nature of those few tasks that showed emergence.
I'd love to hear more from someone more familiar with this stuff. (I've done research that touches on ML, but neural nets and especially LLMs aren't my area at all.) In particular, how compelling is this finding that zero-shot learning (i.e. without in-context learning or instruction tuning) remains hard as model size grows?