woelen
Super Administrator
       
Posts: 8149
Registered: 20-8-2005
Location: Netherlands
Member Is Offline
Mood: interested
It indeed is a matter of training data. In theory we could make an AI that would be amazing at chemistry, if it worked through terabytes of text about chemistry and if people were supervising it (remember, just passing data is not enough; supervised learning is used).
I myself installed Stable Diffusion on my PC, with an RTX 3060 video adapter as the inference engine, and there I observe the same issue with training data. If I ask the system to generate images of people, including hands and feet, then frequently the hands and feet look deformed (strange fingers, not 5 fingers on one hand, feet melted together, etc.), while faces look very natural. This is because in most models the training set contains many, many faces, while the number of hands and feet in the images is much lower. There are models that were trained for the special purpose of generating images of people, and these perform better on people, but the same models perform worse on general images, like landscapes, machines, etc.
I also tried what the standard Stable Diffusion 1.5 model makes of a prompt like "cloud of nitrogen dioxide". It produced a picture of a landscape with houses and trees, and a big white cloud originating from some spot. It looked like a quite realistic picture of a steam cloud, but not like NO2 at all.
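For anyone who wants to repeat that experiment, a minimal inference sketch, assuming the Hugging Face diffusers library, a CUDA-capable card like the RTX 3060, and the commonly used Stable Diffusion 1.5 checkpoint (the output file name is just illustrative):
| Code: |
# Minimal local Stable Diffusion 1.5 inference with the "diffusers" library.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # standard SD 1.5 checkpoint
    torch_dtype=torch.float16,          # half precision to fit consumer VRAM
).to("cuda")

image = pipe("cloud of nitrogen dioxide").images[0]
image.save("no2_cloud.png")             # illustrative output file name
|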
teodor
International Hazard
   
Posts: 1142
Registered: 28-6-2019
Location: Netherlands
Member Is Offline
Some part of scientific knowledge organization requires logical reasoning, not only training by examples. This comes down to the fact that any rule can have exceptions, and those exceptions should be organized into a system of knowledge. The disadvantage of neural networks is that they cannot reveal how the data is organized.
SnailsAttack
Hazard to Others
 
Posts: 175
Registered: 7-2-2022
Location: The bottom of Lake Ontario
Member Is Offline

the uses of ChatGPT range from a novel search engine with a uniquely governable perspective to a shitpost generator
mayko
International Hazard
   
Posts: 1229
Registered: 17-1-2013
Location: Carrboro, NC
Member Is Offline
Mood: anomalous (Euclid class)
There's a riddle I've always remembered, partly because of how foolish I felt at being stumped when I heard the answer:
| Quote: |
A father and his son are going on a fishing trip. They load the car up with tackle boxes and a cooler, and head out of town for their favorite spot on the river. Unfortunately, as they merge onto the freeway, a semi changes into their lane and rams their station wagon off the road. The father is killed instantly, and the son, mortally wounded, is rushed to the hospital, where the nurses hurriedly prep him for surgery. But then the surgeon comes into the operating room, turns pale, and says: "I can't operate on this boy! He's my son!"
|
I have to admit, the current generation of "AI" is a big step up from the ELIZA chatbots I grew up with, in terms of size and sophistication. Still, I would not take their output much more seriously than I would a known bullshitter's, and I don't expect their downsides to be resolved simply by throwing larger and larger training sets at them.
One thing is that there's a ceiling to the amount of text available for training, with some forecasts predicting exhaustion within a few years. Using model-generated input can tank model performance, which means that we probably can't just turn the spigot and generate a useful dataset. It also means that as the web fills up with algorithmically generated press releases, Wikipedia articles, and forum posts, training sets are likely to be of lower and lower quality. Even attempts to gather fresh data might be difficult, given the temptation to get a machine to produce it: a recent survey found that more than a third of Mechanical Turk workers were using LLMs to generate content. (I think this is especially ironic given the mechanical-Turk aspects of "artificial" intelligence systems, which can require substantial human labor behind the scenes in order to function. That labor often takes place under outright exploitative conditions.)
A different point comes from this example, riffing on a joke by Zach Weinersmith about the infamously counterintuitive Monty Hall problem:
[screenshots of the chatbot's attempts at the Monty Hall variant not reproduced]
The software can generate syntactically well-formed sentences, and a large probability distribution will tend to make them pragmatically and semantically correct, or at least sensible. But is that really a substrate for abstract reasoning? In these cases, apparently not; even when it comes up with the correct answer, it can't explain why; it just free-associates in the probabilistic neighborhood of "goat" + "door" + "car". For this question at least, it would probably have done better trained on a smaller corpus that didn't discuss the probability puzzle!
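For comparison, the puzzle itself yields to a few lines of brute force; a quick Monte Carlo sketch (plain Python, no chatbot involved) showing that switching wins about 2/3 of the time:
| Code: |
# Monte Carlo check of the Monty Hall problem: switching wins ~2/3 of games.
import random

def play(switch, trials=100_000):
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)      # door hiding the car
        pick = random.randrange(3)     # contestant's first choice
        # the host opens a door that is neither the pick nor the car
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print("stay:  ", play(switch=False))   # ~0.33
print("switch:", play(switch=True))    # ~0.67
|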
Similar is their tendency to simply make stuff up, fabricating everything from research papers to court cases. Is this a symptom of too little training data? Or is it just something that happens in systems sufficiently large and complex to give the appearance of knowledge? If it's the latter, would a bigger corpus do anything but give more material to draw hallucinations from?
[screenshot of another chatbot exchange, misreading a riddle about a pregnant attorney, not reproduced]
Some attorneys are women, like some surgeons. Maybe they don't make up a majority of either profession, but when confronted with its parsing error, the chatbot's justification isn't quantitative; the response is to chide the user for constructing a sentence so badly as to suggest such logical nonsense as a pregnant attorney! A better explanation is that the texts it trained on weren't written in a social vacuum. They generally came from societies where the riddle I started with might be puzzling. If you fed it more texts written from within the same social arrangement, or one that shares many of its biases and unstated assumptions, would it get better at avoiding them?
Honestly, it's starting to sound familiar...
al-khemie is not a terrorist organization
"Chemicals, chemicals... I need chemicals!" - George Hayduke
"Wubbalubba dub-dub!" - Rick Sanchez
macckone
Dispenser of practical lab wisdom
   
Posts: 2202
Registered: 1-3-2013
Location: Over a mile high
Member Is Offline
Mood: Electrical
ChatGPT has had various instances of going horribly wrong.
If it doesn't know the answer, it makes crap up.
Which, to be fair, humans do too.
The problem occurs when you try to use a language model for empirical knowledge.
Examples: a chemical procedure, a legal citation, building a bridge, diagnosing an illness.
It also won't fly planes or drive cars.
Using it with computer code can get you about 80% of the way there, but so can a beginning programmer.
The difference is that ChatGPT code is syntactically correct but algorithmically dubious for any complex problem.
ChatGPT is useful for taking knowledge you already have and transforming it into writing.
It is not useful for answering questions requiring accurate answers to complex multistep problems.
mayko
International Hazard
   
Posts: 1229
Registered: 17-1-2013
Location: Carrboro, NC
Member Is Offline
Mood: anomalous (Euclid class)
"thanks; I hate it"
al-khemie is not a terrorist organization
"Chemicals, chemicals... I need chemicals!" - George Hayduke
"Wubbalubba dub-dub!" - Rick Sanchez
mayko
International Hazard
   
Posts: 1229
Registered: 17-1-2013
Location: Carrboro, NC
Member Is Offline
Mood: anomalous (Euclid class)
You know how in movies and TV you'll see molecules drawn on the blackboard in the background, but if you pause and zoom in you'll find hexavalent
carbon and the like?
The chemistry community should ban [from publication and instructional aids] drawing chemical structures with generative AI, chemists warn
| Quote: |
Moores says that what ultimately triggered her to write the opinion piece was a cover in a green chemistry journal where AI had been used to generate
chemical structures that, she says, were clearly wrong. ‘It was on LinkedIn, it was really sleek, it was very nice. And [then] you [really] looked
at it and the chemical structures were wrong.’
Moores says that she has played with the likes of ChatGPT and Copilot and asked both to produce different visuals for an inorganic chemistry course
she teaches. ‘I wanted to make different illustrations for my course, and it was a disaster … specifically I asked Copilot to represent inorganic
chemistry and I asked for the periodic table, and it was not at all the periodic table. I did a lot of prompt engineering to try to make it better,
and it was clearly not able to handle that.’
|
I was surprised at how much trouble the AIs had with the structure of caffeine, given its popularity on shirts and mugs of coffee nerds.
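For what it's worth, the structure is easy to get right with deterministic tools; a minimal sketch, assuming the RDKit cheminformatics library (the output file name is illustrative), which draws caffeine from its SMILES string the same way every time:
| Code: |
# Drawing caffeine (1,3,7-trimethylxanthine) deterministically from SMILES.
from rdkit import Chem
from rdkit.Chem import Draw

caffeine = Chem.MolFromSmiles("CN1C=NC2=C1C(=O)N(C(=O)N2C)C")
Draw.MolToFile(caffeine, "caffeine.png", size=(300, 300))
|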
I also remembered something I'd read a while ago, about a physicist who started a side hustle as a professional consultant to the cranks who kept emailing her colleagues with new theories of everything. Their money is as green as anyone's, and she came away with some insights about science communication and outreach. Her clients tended to rely exclusively on popular sources like National Geographic, Scientific American, and Popular Mechanics rather than technical sources. A consequence was that they would often take literally the artistic fluff the magazines use to jazz up their illustrations:
| Quote: |
Science writers should be more careful to point out when we are using metaphors. My clients read way too much into pictures, measuring every angle,
scrutinising every colour, counting every dash. Illustrators should be more careful to point out what is relevant information and what is artistic
freedom.
|
This seems relevant given that some of the examples given, especially from Google Gemini, are not just inaccurate but embellished with that same distracting fluff.
al-khemie is not a terrorist organization
"Chemicals, chemicals... I need chemicals!" - George Hayduke
"Wubbalubba dub-dub!" - Rick Sanchez
BromicAcid
International Hazard
   
Posts: 3315
Registered: 13-7-2003
Location: Wisconsin
Member Is Offline
Mood: Rock n' Roll
I don't use it. Since I've mentioned before who I work for, I can't go into specifics, but it's creating a big problem that I only see getting bigger every day.
[Edited on 10/26/2025 by BromicAcid]
MrDoctor
Hazard to Others
 
Posts: 236
Registered: 5-7-2022
Member Is Offline
Not ChatGPT, but Grok I find extremely useful, although like any LLM the value of its responses diminishes rapidly with subsequent questions, especially if you ask it to clarify something. It tends not to notice when it's wrong until this is explicitly pointed out, and then it just totally flip-flops because you asked it to reconsider; it didn't actually reconsider, it just shifted its approach to maintain the perception of agreeableness. You might have noticed it's impossible to debate LLMs properly: they are either too rigid or too fluid.
Grok is great, though, for finding resources and for explaining the fundamental mechanics of a given kind of reaction in the context at hand, and if you are aware of its limitations you can go really far with it.
It should be noted, though, that making an LLM write you a "recipe" or reaction instructions for anything will almost certainly result in either failure or in doing things in a really stupid, unintuitive way. LLMs seem to favor patent literature for this, or writeups written by novices asking "will this work?". If you half understand the reaction then there isn't really anything more to it; you just mix A with B, maybe under special conditions C, for time D. You can't trust the LLM to maintain molar calculations and balances, and even then we use excess where appropriate; not everything is done stoichiometrically perfect, and where it needs to be, you have to titrate, which can get complicated with unstable reagents, e.g. bleach or peroxide, which never match the label exactly.
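That kind of molar sanity check is trivial to do yourself rather than trusting the LLM's numbers; a minimal sketch (the reaction, masses, and helper function are purely illustrative):
| Code: |
# Hypothetical molar-ratio sanity check: convert masses to moles and compare
# the actual ratio to the stoichiometric one, allowing for deliberate excess.
def moles(mass_g, molar_mass_g_per_mol):
    return mass_g / molar_mass_g_per_mol

# Illustrative example: NaOH + HCl -> NaCl + H2O (1:1 stoichiometry)
n_naoh = moles(4.0, 40.00)   # 0.100 mol NaOH
n_hcl = moles(3.3, 36.46)    # ~0.091 mol HCl
print(f"NaOH/HCl mole ratio: {n_naoh / n_hcl:.2f} (1.00 = exact stoichiometry)")
|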
I use it for finding niche reaction pathways, and I always make sure to have it pull up citations for things I've never heard of before; above all else I explicitly research what it gives me. Usually I can copy-paste chunks of text, like about how catalyst acid sites work, and I'll get similar explanations on a scholar search, which I now understand better because Grok already framed the concepts in relation to the task at hand; now I can actually comprehend more advanced literature, and usually tell whether Grok BS'd or not. If all else fails, it leaves me with a nice direct question for actual peers, for which I do my due diligence rather than burdening them with "AI said this, is it true?"
AI is a tool that demands a bit of effort and self-discipline to apply properly; the quality of the information you get is proportional to the effort you handle the tool with. If you can't possibly validate what the AI spits out, though, you probably shouldn't be doing whatever it is you are doing, and should instead focus on extending your fundamental concepts.
A good example is basically any sort of hydride or similar compound: extremely powerful reducing agents that are pyrophoric in normal air. I once thought they were no big deal to try making, because the wiki articles present the synthesis in a pretty concise manner (then again, they oversimplify the contact process too). But actually doing the synthesis requires SO many safeguards, a well-tuned reaction, and experience to handle everything safely without blowing up. Most summaries don't address the nitty-gritty, because anyone in the know just knows: nitrations are dangerous, hydrogenations leak hydrogen, alkali metals ignite in air, etc. Yet AI has no issue suggesting you first grind up your lithium in a mortar and pestle to 200 mesh, because in real life the PPE requirements are obvious and don't have to be mentioned beyond noting that lithium powder is pyrophoric. Dry lithium powder will pretty much explode on contact with oxygen if it's fine enough, but that nuance is lost on AI because nobody ever talks about the obvious; it just sees "be careful" and repeats that.
Texium
Administrator
      
Posts: 4729
Registered: 11-1-2014
Location: Salt Lake City
Member Is Offline
Mood: Seeking gainful employment
Yeah, I’m gonna pass on getting advice from the robo-Nazi.