Just waiting for some vibe coder at Google to say the Gemini AI wrote it.
Charter subscriber to Byte Magazine here. I built my Altair 8800 from the Popular Electronics article. The article left out the First Annual World Altair Computer Convention, March 1976 in Albuquerque, New Mexico.
It is Claude 4 Opus, the top publicly available model.
Without a credit card you can get 50 prompts.
With a credit card you can get a 14-day free trial of the $50, 600-prompt account.
Now get the judge in the Google anti-trust trial to order Google to create this and you're done. I don't think the proposal is unreasonable; someone just needs to get it in front of the right people. Google would certainly prefer doing this to selling off Chrome.
The non-profit controlling the pool can negotiate user fees based on usage. As for people who bypass the pooled crawler: every web site should let them make requests and then never respond to those requests, effectively keeping them in infinite timeouts. Public embarrassment of the bypassing entities will also help control this.
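The never-respond idea can be sketched as a simple tarpit check. The deny-list and function name here are hypothetical, just to illustrate the decision point:

```python
# Hypothetical deny-list of crawlers known to bypass the pooled crawler.
BYPASSING_CRAWLERS = {"203.0.113.7", "198.51.100.21"}  # example IPs only

def should_tarpit(client_ip: str) -> bool:
    """Return True when the server should accept the connection but
    never answer it, leaving the bypassing crawler to hang until its
    own timeout fires."""
    return client_ip in BYPASSING_CRAWLERS
```

A real deployment would hold the accepted socket open without ever sending a byte, so each bypassing request ties up a connection on the crawler's side for the full timeout.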
There is a good solution to prevent gaming this. The crawler can use AI to assess whether it wants to pay the price the page is asking. It can always decide the price is too high and not add the page to the index; in that case it doesn't pay. The payment is not for crawling, it is for permission to be added to the global index.
Another option would be to let each page set a micropayment amount in its headers. Then the crawler could crawl until it runs out of money. This works as a double-edged sword: set your micropayment amount too high and you won't get crawled, and you'll drop out of every search index. So it's your choice. The single crawler would crawl free pages first and then crawl from cheapest to most expensive until it runs out of money. Obviously if you set your micropayment at $100 you're never going to get crawled, and you'll never appear in any search engine either.
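A minimal sketch of that cheapest-first budget crawl, assuming prices come from a per-page micropayment header and are expressed in integer cents to avoid float drift (the function and field names are illustrative, not any real crawler's API):

```python
def plan_crawl(page_prices: dict[str, int], budget_cents: int):
    """Order pages free-first, then cheapest to most expensive, and
    crawl until the budget runs out.  page_prices maps URL -> price in
    cents, taken from the page's (hypothetical) micropayment header."""
    crawled, remaining = [], budget_cents
    for url, price in sorted(page_prices.items(), key=lambda kv: kv[1]):
        if price > remaining:
            break  # everything after this is at least as expensive
        remaining -= price
        crawled.append(url)
    return crawled, remaining
```

A site that prices itself at $100 sorts to the very end and simply never gets reached before the budget is gone, which is the double-edged sword described above.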
I would like to see the Google anti-trust trial solve this. A good solution is to have a single crawler for the web (it can be Google's, via the anti-trust settlement) and then everyone pays into a pool to get access to the feeds from that single crawler. Payments into that pool can then be used to make the equivalent of statutory royalty payments to the sites crawled. If you don't want to be crawled, put your stuff behind a login. Of course you are going to be sorely disappointed in the amount you get from those statutory payments, simply because of the sheer number of web pages (around 47B). Optimistically you might be looking at $0.10 per page per year.
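The back-of-the-envelope behind that $0.10 figure: multiply the payout rate across the ~47B pages and you get the yearly pool size it implies (these are the comment's own rough numbers, not real data):

```python
pages = 47_000_000_000        # ~47B crawled web pages
per_page_per_year = 0.10      # optimistic statutory payout, dollars
pool_per_year = pages * per_page_per_year
# A dime a page across 47B pages needs a pool of roughly $4.7B/yr,
# which is why the per-site payout ends up so small.
```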
Claude 4 Opus is far better than previous models for code generation. But none of this can be used without considerable code review.
It is not at all deterministic. That's because all of the tools I am using are under constant development. If everything in my environment stopped changing I suspect it would be deterministic, but that's never going to happen.
Hallucination is somewhat under your control. There are three main sources:

1) You exceed the context window of the model. For that one you just need to learn the limits of your model and not exceed them. Don't ask it to do a refactor that is going to touch two million lines of code; two million lines won't fit into the context window, so it's going to hallucinate substitutions for whatever doesn't fit.

2) It hallucinates when your prompt is not specific enough and it fills in the blanks on its own. You can control that by writing very specific prompts.

3) Sometimes it just goes off the rails. For that one you have to closely monitor the terminal, tracking what the agent is doing, and if it goes off into the weeds, stop it and explain how to get back on track. That's similar to what you need to do with junior programmers.
I would say I am not ending up with any hallucinations (that I am aware of) in the code I am generating, but that's because I am very closely watching everything it is doing, and I test and review everything manually before committing. If you're a vibe coder who gives a prompt and then goes off to lunch while it works, you're going to have problems and end up throwing away massive amounts of generated code. The second you stop reviewing what it is doing, you are doomed, because you will accumulate piles of code you don't understand and can't fix.
What turns a junior programmer into a senior one is experience dealing with problems of ever-increasing complexity. Currently there is an upper bound to the complexity the model can handle; I'd estimate it at around that of a 25-year-old programmer. It is going to take AGI to replace the senior developers.
Consider that I have been feeding the AI tasks like this for a month now on this project. I am using it to write a complete Android app. It is up to 752 files and 110,000 lines of code. Looking at my history I can see that I have given it 30-50KB prompts about 35 times. That sort of implies a 1.5MB prompt would be needed to get to where I am currently. So where would that 1.5MB prompt come from? That's the part it can't do yet. I'd have to spend a couple of weeks working with the AI on that prompt, which is a far more manual process than watching it execute the prompt. Then there is the part where I have to continuously supervise it or it will wander off on a tangent. The key bit here is: I know what a tangent is, and it doesn't.
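The 1.5MB figure is just the prompt history added up; a quick check of that arithmetic using the 30-50KB range and the ~35 prompts mentioned above:

```python
prompts = 35
low_kb, high_kb = 30, 50           # per-prompt size range, KB
total_low_kb = prompts * low_kb    # 1,050 KB ~= 1.05 MB
total_high_kb = prompts * high_kb  # 1,750 KB ~= 1.75 MB
# The midpoint of that range lands right around the ~1.5MB estimate.
```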
So I suspect it is going to happen, but it is going to take AGI. The task of taking input from a VP and marketing and turning it into a product is far too unstructured and nebulous for today's AIs.
Note that I had spent an hour working with the AI to write the prompt I used at 778 lines, 29,953 bytes. And I still had to intervene a dozen times while it worked to keep it from going off on a tangent. Six months ago it wasn't even possible to do something like this.
I have abandoned Windsurf and switched to Augment. Windsurf is gen 1, Augment is gen 2.
I have nothing to do with either company, and I will switch again if I discover a better tool.
90 minutes and it is still chugging away fixing the bugs. I had to intervene two more times when it went off the rails.
Up to 4,415 lines of code touched.
Checked back in; it's been 75 minutes. The app loads and a reasonable screen appears. It has multiple obvious bugs, so I tell Claude what to fix.
So far it has touched 57 files and modified 3,676 lines of Kotlin code using Jetpack Compose.
The trouble with being punctual is that people think you have nothing more important to do.