ecommerceguy 3 days ago

I'm getting AI fatigue. It's OK for rewriting quick emails that I'm having brain farts on, but for anything deep it just sucks. I certainly can't see paying for it.

aurareturn 3 days ago | parent | next [-]

Weird, because AI has been solving hard problems for me, even finding solutions that I couldn't find myself. I.e. sometimes my brain can't wrap around a problem, I throw it to AI, and it solves it perfectly.

I pay for ChatGPT Plus and GitHub Copilot.

leptons 3 days ago | parent | next [-]

It is weird that AI is solving hard problems for you. I can't get it to do the most basic things consistently; most of the time it's just pure garbage. I'd never pay for "AI" because it wastes more of my time than it saves. But I've never had a problem wrapping my head around a problem; I solve problems.

I'm curious what kind of problem your "brain can't wrap around" but the AI could.

aurareturn 2 days ago | parent | next [-]

  I'm curious what kind of problem your "brain can't wrap around" but the AI could.

One of the most common use cases is that I can't figure out why my SQL statement is erroring or doesn't work the way it should. I throw it into ChatGPT and it usually solves it instantly.
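A made-up example of the kind of subtle bug I mean, sketched in Python with sqlite3 so it's self-contained (hypothetical tables, not my real schema):

  import sqlite3

  con = sqlite3.connect(":memory:")
  con.executescript("""
      CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
      CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
      INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob');
      INSERT INTO orders VALUES (10, 1, 99.0);
  """)

  # Intent: list every customer with their order total, if any.
  # Bug: the WHERE filter on the right-hand table throws away the NULL rows,
  # so the LEFT JOIN silently behaves like an INNER JOIN and Bob disappears.
  buggy = """
      SELECT c.name, o.total
      FROM customers c
      LEFT JOIN orders o ON o.customer_id = c.id
      WHERE o.total > 0
  """

  # Fix: move the condition into the join so unmatched customers survive.
  fixed = """
      SELECT c.name, o.total
      FROM customers c
      LEFT JOIN orders o ON o.customer_id = c.id AND o.total > 0
  """

  print(con.execute(buggy).fetchall())  # only Ada -- Bob is silently dropped
  print(con.execute(fixed).fetchall())  # both customers, Bob with a NULL total

In a long query that kind of thing is easy to stare past, which is where an instant second pair of eyes pays off.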
Wilduck 2 days ago | parent [-]

Is that a "hard problem" though? Really?

aurareturn 2 days ago | parent | next [-]

Yes. To me, it is. Sometimes queries I give it are 100-200 lines long. Sure, I can solve it eventually but getting an "instant" answer that is usually correct? Absolutely priceless.

In the past it was pretty common for me to spend a day stuck on a gnarly problem. Most developers have. Now I'd say that's extremely rare. Either an LLM will solve it outright quickly, or I get enough clues from an LLM to solve it efficiently.

Draiken 2 days ago | parent | next [-]

You might be robbing yourself of the opportunity to learn SQL for real by short-cutting to a solution that might not even be the correct one.

I've tried using LLMs for SQL and they fail at exactly that: complexity. Sure, they'll get the basic queries right, but throw in anything that isn't standard everyday SQL and they'll very confidently give you solutions that aren't great.

If you don't know SQL enough to figure out these issues in the first place, you don't know if the solutions the LLM provides are actually good or not. That's a real bad place to be in.

navigate8310 2 days ago | parent | prev | next [-]

Usually the term "hard problem" is reserved for problems that require novel solutions.

IgorPartola 2 days ago | parent | next [-]

Have you ever read Zen and the Art of Motorcycle Maintenance? One of the first examples in that book is how, when you are disassembling a motorcycle, any one bolt is trivial until one gets stuck. Then it becomes your entire world for a while as you try to solve the problem, and the solution can range from trivial to amazingly complex.

You are using the term “hard problem” to mean something like solving P vs. NP. But in reality, as soon as you step outside of your area of expertise, most problems will be hard for you. I will give you some examples of things you might find to be hard problems (without knowing your background):

- what is the correct way to frame a door into a structural exterior wall of a house with 10 foot ceilings that minimizes heat transfer and is code compliant.

- what is the correct torque spec and sequence for a Briggs and Stratton single cylinder 500 cc motor.

- how to correctly identify a vintage Stanley hand plane (there were nearly two dozen generations of them, some with a dozen different types), and how to compare them and assess their value.

- how to repair a cracked piece of structural plastic. This one was really interesting for me because I came up with about 5 approaches and tried two of them before asking an LLM, which quickly explained why none of my solutions would work with that specific type of plastic (HDPE is not something you can glue with most resins or epoxies; it turns out plastic welding is the main and best solution). What it came up with was more cost-efficient, easier, and quicker than anything I thought up.

- explaining why mixing felt, rust, and CA glue caused an exothermic reaction.

- find obscure local programs designed to financially help first time home buyers and analyze their eligibility criteria.

In all cases I was able to verify the solutions. In all cases I was not an expert on the subject and in all cases for me these problems presented serious difficulty so you might colloquially refer to them as hard problems.

aurareturn 2 days ago | parent | prev [-]

It is not. It’s relative to the subject.

In this case, the original author stated that AI is only good for rewriting emails. I showed a much harder problem that AI is able to help me with. So clearly, my problem can be reasonably described as “hard” relative to rewriting emails.

m4rtink 2 days ago | parent | prev | next [-]

If you have 200-line SQL queries, you have a whole other kind of problem.

r0x0r007 2 days ago | parent [-]

Not unless you are only working on todo apps.

hshdhdhehd 2 days ago | parent [-]

TODO: refactor the schema design.

hshdhdhehd 2 days ago | parent | prev | next [-]

The problem with this is that people will accept tech debt and slow queries so long as the LLM can make sense of it (allegedly!).

So the craft is lost: making that optimised query, or simplifying the solution space.

No one will ask "should it even be relational?" if the LLM can spit out SQL and they can move on to the next problem.

aurareturn 2 days ago | parent [-]

So why not ask the LLM if it should be relational and provide the pros and cons?

Anyway, I'm sure people have asked if we should be programming in C rather than Assembly to preserve the craft.

GoatInGrey 2 days ago | parent | next [-]

Surely you understand the difference between not knowing how to do anything by yourself and only knowing how to use high-level languages?

hshdhdhehd 2 days ago | parent | prev [-]

That is like using the LLM as a book. Sure, do that! But a human still needs to understand and make the decisions.

leptons 2 days ago | parent | prev [-]

What happens when these "AI" companies start charging you what it really costs to run the "AI"? You'd very likely balk at it and have to learn SQL yourself. Enjoy it while it lasts, I guess?

enraged_camel 2 days ago | parent | prev [-]

I work with some very complex queries (that I didn't write), and yeah, AI is an absolute lifesaver, especially in troubleshooting situations. What used to take me hours now takes me minutes.

praveen9920 2 days ago | parent | prev | next [-]

In my case, learning new stuff is one place I see AI playing a major role, especially academic research, which is hard to get started with if you're a newbie. With AI I can start my research and read more papers with better clarity.

sumedh 2 days ago | parent | prev | next [-]

Which model are you using?

Daz912 2 days ago | parent | prev [-]

Sounds like you're not capable of using AI correctly, user error.

leptons 2 days ago | parent | next [-]

Sorry, I'm not taking a comment like this from a 2-hour old account seriously. You don't know me at all.

lompad 2 days ago | parent | prev [-]

"It can't be that stupid, you must be prompting it wrong!"

Sigh.

DecentShoes 2 days ago | parent | prev | next [-]

Can you give some examples??

jaggederest 2 days ago | parent [-]

Calculate the return on investment for a solar installation of a specified size on a specified property, based on the current dynamic prices of all of the panels, batteries, inverter, and balance-of-system components, the current zoning and electrical code, the current cost of capital, the average insolation and weather (taking into account likely changes as weather instability increases with rising global temperatures), the chosen installation method and angle, and the optimal angle of the solar panels if adjusted monthly or quarterly. Now do a Manual J calculation to determine the correct size of heat pump for each section of that property, taking into account the number of occupants, insulation level, etc.

ChatGPT is currently the best solar calculator on the publicly accessible internet, and it's not even close. It'll give you the internal rate of return, ask all the relevant questions, find you all the discounts you can take in taxes and incentives, determine whether you should pay the additional permitting and inspection cost for net metering or just go with local usage and batteries, size the batteries for you, and find some candidate electricians to do the actual installation once you acquire the equipment.

Edit: My guess is that it'd cost several thousand dollars to hire someone to do this for you, and it'll save you probably in the $10k-$30k range on the final outcomes, depending on the size of system.
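To make the arithmetic concrete, here is a minimal sketch of the payback/IRR core in Python; every number in it is an illustrative assumption, not a real quote, incentive, or insolation figure:

  # Rough solar ROI sketch -- all inputs below are illustrative assumptions.
  system_cost = 18_000.0         # panels + inverter + batteries + install, after incentives
  annual_production_kwh = 9_500  # from assumed system size and insolation
  electricity_price = 0.16       # assumed $/kWh offset today
  price_escalation = 0.03        # assumed yearly utility price growth
  degradation = 0.005            # assumed yearly panel output loss
  years = 25

  def cash_flows():
      """Year 0 outlay, then yearly savings: output decays, prices climb."""
      flows = [-system_cost]
      for y in range(years):
          kwh = annual_production_kwh * (1 - degradation) ** y
          price = electricity_price * (1 + price_escalation) ** y
          flows.append(kwh * price)
      return flows

  def npv(rate, flows):
      return sum(cf / (1 + rate) ** t for t, cf in enumerate(flows))

  def irr(flows, lo=-0.5, hi=1.0):
      """Internal rate of return by bisection on NPV."""
      for _ in range(100):
          mid = (lo + hi) / 2
          if npv(mid, flows) > 0:
              lo = mid
          else:
              hi = mid
      return (lo + hi) / 2

  flows = cash_flows()
  print(f"25-year savings: ${sum(flows[1:]):,.0f}")
  print(f"IRR: {irr(flows):.1%}")

The interesting part is filling in those inputs (local prices, incentives, code requirements); the arithmetic itself is simple.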

aprilthird2021 2 days ago | parent | next [-]

My God, the first example is having an AI do math, and then he says "Well I trust it to a standard deviation".

So it's literally the same as googling "what's the ballpark solar installation cost for X in Y area". Unbelievable, and people pay $20+ per month for this.

2 days ago | parent | next [-]
[deleted]
bdangubic 2 days ago | parent | prev [-]

$200 :)

m4rtink 2 days ago | parent | prev [-]

Any way to tell if the convincing final numbers it told you are real or hallucinated?

visarga 2 days ago | parent | next [-]

Solve the same task with ChatGPT, Gemini and Claude. If they agree, you can be reasonably sure.
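A sketch of what that cross-check could look like; ask_model here is a hypothetical placeholder for whatever client you use per provider, not a real API:

  def ask_model(provider: str, prompt: str) -> float:
      """Hypothetical helper: send the prompt to one provider and parse a
      single numeric answer out of the reply. Wire up real clients yourself."""
      raise NotImplementedError

  def cross_check(prompt: str, tolerance: float = 0.05) -> dict:
      """Ask several models the same question and flag disagreement."""
      answers = {p: ask_model(p, prompt) for p in ("chatgpt", "gemini", "claude")}
      lo, hi = min(answers.values()), max(answers.values())
      spread = (hi - lo) / max(abs(hi), 1e-9)
      return {"answers": answers, "agree": spread <= tolerance}

  # If agree is False, treat the numbers as a starting point, not a result.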

caminante a day ago | parent [-]

I'm not opposed to experimenting, but that's a recipe for false confidence in a final decision.

visarga 10 hours ago | parent [-]

Where they agree, it shows the data supports that answer (not necessarily that it is true); where they disagree, it shows you need to hedge. That's useful.

jaggederest 2 days ago | parent | prev [-]

I checked them carefully myself with various other tools. It was using Python to do the math, so I trust it to within a single standard deviation at least.

mb7733 2 days ago | parent [-]

Standard deviation of what?

caminante 2 days ago | parent [-]

I'm lost too. Financials are technology agnostic.

They probably meant that they could read (and trace) the logic in Python for correctness.

jaggederest 2 days ago | parent | next [-]

To reword: I wouldn't publish it as statistically significant, but it's within the error bounds of a real human accomplishing the same task.

caminante a day ago | parent [-]

I see your edit.

I would recommend spending that "couple thousand" for quote(s). It's a second opinion from someone who hopefully has high volume in your local market. And your downside could be the entire system plus remediation, fines, etc.

To be clear, I'm not opposed to experimenting, but I wouldn't rely on this. Appreciate your comment for the discussion.

jaggederest a day ago | parent [-]

No, I'm not relying on it in the sense of going out and running the entire project through it, but as an accurate screener for whether it's worth doing, there's nothing comparable available.

2 days ago | parent | prev [-]
[deleted]
weregiraffe 2 days ago | parent | prev [-]

>Weird because AI has been solving hard problems for me.

Examples or it didn't happen.

anonzzzies 2 days ago | parent | prev | next [-]

Well, deep/hard is different, I guess; I use it, day and night, for things I find boring: boilerplate coding (which now is basically everything that's not pure business logic / logic / etc.), corporate docs, reports, and so on. Everything I don't want to do is done by AI now. It's great. Outside work I use it for absolutely nothing though; I am writing a book, a framework, and a database, and that's all manual work (and I don't think AI is good at any of those (yet)).

IgorPartola 2 days ago | parent | prev [-]

As an LLM skeptic who got a Claude subscription, I can say the free models are both much dumber and configured for low latency and short, dumb replies.

No, it won't replace my job this year or the next, but what Sonnet 4.5 and GPT-5 can do compared to e.g. Gemini Flash 2.5 is incredible. They for sure have their limits and do hallucinate quite a bit once the context they are holding gets messy enough, but with careful guidance and context resets you can get some very serious work done with them.

I will give you an example of what it can't do and what it can. I am working on a complicated financial library in Python that requires understanding nuanced parts of tax law. A best-in-class LLM cannot correctly write the library code because the core algorithm is just not intuitive. But it can:

1. Update all invocations of the library when I add non-optional parameters that in most cases have static values. This includes updating over 100 lengthy automated tests.

2. Refactor the library to be more streamlined and robust to use. In my case I was using dataclasses as the base interface into and out of it, and it helped me split one set of classes into three: input, intermediate, and output, while fully preserving functionality. This was a pattern it suggested after a changing requirement made the original interface make much less sense (a sketch of the shape of that split follows this list).

3. Point me to where the root cause of failing unit tests was after I changed the code.

4. Suggest and implement a suite of new automated tests (though its performance tests were useless enough for me to toss out in the end).

5. Create a mock external API for me to use based on available documentation from a vendor so I could work against something while the vendor contract is being negotiated.

6. Create comprehensive documentation on library use with examples of edge cases based on code and comments in the code. Also generate solid docstrings for every function and method where I didn’t have one.

7. Research thorny edge cases and compare my solutions to commercial ones.

8. Act as a rubber ducky when I had to make architectural decisions to help me choose the best option.
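To sketch the shape of the split from item 2 (every name below is invented for illustration; the real library's fields and tax rules are not shown):

  from dataclasses import dataclass
  from decimal import Decimal

  @dataclass(frozen=True)
  class TaxInput:
      """What the caller provides."""
      gross_income: Decimal
      filing_status: str

  @dataclass(frozen=True)
  class TaxIntermediate:
      """Working values the core algorithm passes between its steps."""
      taxable_income: Decimal
      applicable_brackets: tuple

  @dataclass(frozen=True)
  class TaxOutput:
      """What the caller gets back."""
      total_tax: Decimal
      effective_rate: Decimal

  def compute(inp: TaxInput) -> TaxOutput:
      # Placeholder pipeline: input -> intermediate -> output.
      intermediate = TaxIntermediate(inp.gross_income, applicable_brackets=())
      return TaxOutput(total_tax=Decimal("0"), effective_rate=Decimal("0"))

The point is just the shape: three frozen dataclasses for input, intermediate state, and output, so each layer can change without breaking the others.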

It did all of the above without errors or hallucinations. And it's not that I am incapable of doing any of it, but it would have taken me longer and would have tested my patience for most of it. Manipulating boilerplate, or documenting the semantic meaning of a dozen new parameters that control edge-case behavior only relevant to very specific situations, is not my favorite thing to do, but an LLM does a great job of it.

I do wish LLMs were better than they are, because as much as the above worked well for me, I have also seen them do some really dumb stuff. But they are already way better than they have any right to be. Here is a short list of other things I have tried with them, not code related, that have worked incredibly well:

- explaining pop culture phenomena. For example, I had never understood why Dr Who fans take a goofy, campy show aimed (in my opinion) at 12 year olds as seriously as if it were War and Peace. An LLM let me ask all the dumb questions I had about it in a way that explained it well.

- have a theological discussion on the problem of good and evil as well as the underpinnings of Christian and Judaic mythology.

- analyze in depth my music tastes in rock and roll and help fill in the gaps in terms of its evolution. It actually helped me identify why I like the music I like despite my tastes spanning a ton of genres and, specifically when it comes to rock, it created one of the best and most well-curated playlists I had ever seen. This is high praise coming from me, since I pride myself on creating really good thematic playlists.

- help answer my questions about woodworking and vintage tool identification and restoration. This stuff would have taken ages to research on forums and the answers would still be filled with purism and biased opinions. The LLM was able to cut through the bullshit with some clever prompting (asking it to act as two competing master craftsmen).

- act as a writing critic. I occasionally like to write essays on random subjects. I would never trust an LLM to write an original essay for me, but I do trust it to tell me when I am using repetitive language, when pacing and transitions are off, and, crucially, how to improve my writing style to take it from B-level college student to what I consider close to professional in a variety of styles.

Again, I want to emphasize that I am still very much on the side of there being a marketing and investment bubble, and of what LLMs can do being way overhyped. But at the same time, over the last few months I have been able to do all of the above just out of curiosity (the first coding example aside). These are things I would never have had the time or energy to get into otherwise.

boggsi2 2 days ago | parent [-]

You seem very thoughtful and careful about all this, but I wonder how you feel about the emergence of these abilities in just 3 years of development? What do you anticipate it will be capable of in the next 3 years?

With no disrespect, I think you are about 6-12 months behind SOTA here; the majority of recent advances have come from long-running task horizons. I would recommend you try some kind of IDE integration or CLI tool. It feels a bit unnatural at first, but once you adapt your style a bit, it is transformational. A lot of context-sticking issues get solved on their own.

IgorPartola 2 days ago | parent [-]

Oh, I am very much catching up. I am using Claude Code primarily, and have also been playing a bit with all the latest API goodies from OpenAI and Anthropic: custom tools, memory use, and creating my own continuous compaction algorithm for a specific workflow I tried. There is a lot happening here very fast.

One thing that struck me: these models were all trained on data ending 1-2 years ago. I think the training cutoff for Sonnet 4.5 is something like May 2024. So I can only imagine what is being trained and tested currently. And these models are just so far ahead of things like Qwen and Llama for the types of semi-complex non-coding tasks I have tried (like interpreting my calendar events) that it isn't even close.

boggsi2 2 days ago | parent [-]

[dead]