Technically there has been only one fatal accident in space: the Soyuz 11 failure, which killed its crew of three. That occurred above the Kármán line; all other spaceflight-related fatalities happened at much lower altitudes or on the ground.
Surely AGI would be matching humans on most tasks. To me, surpassing humans on all cognitive tasks sounds like superintelligence, while AGI "only" needs to perform most, but not necessarily all, cognitive tasks at the level of a human highly capable at that task.
Personally I could accept "most", provided that the failures were near misses as opposed to total face plants. I also wouldn't include "incompatible" tasks in the metric at all (though using that to game the metric can't be permitted either). For example, the typical human only has so much working memory, so tasks which overwhelm it aren't "failed" so much as "incompatible". I'm not sure exactly what that looks like for ML, but I expect the category will exist; a task that relies on adversarial inputs might be one example.
Thanks, I'll see about an on-page zoom. On my 1440p display the whole table fits even with the side drawer open, and with my webdev inexperience I didn't even think of offering zoom controls beyond the browser's.
I love your slide puzzle too. The different hint levels are very cool, where you can show just the element symbol or the full name as well. Surely trivial for chemists, but not so for me.
Thanks! My brother is a chemical engineer so he's about the only one who can come close to solving the puzzles. :)
Also really like how the Timeline fades out the elements to filter by year.
Hmmm... I wonder if it's a HiDPI scaling thing? I tried the site on a couple of browsers (Safari, Chromium) and even on a 4K monitor it only fits Hydrogen to Nitrogen.
StackOverflow was great because it's not like a support forum or a mailing list; it's more like a repository of knowledge. It's been extremely helpful when I arrive from Google, and I've gotten a couple of useful responses to my own questions too. Awesome resource.
Where SO started failing in my opinion is when the "no duplicate questions" rule started to be interpreted as "it's a duplicate if the same or very similar question has ever been answered on the site". That caused too many questions to have outdated answers as the tech changes, best practices change and so on. C# questions have answers that were current for .NET Core 1.0 and should be modified. I have little webdev experience but I know JS has changed rapidly and significantly, so 2012 answers to JS questions are likely not good now.
This echoes my own experience. The very few times I attempted to post a question it was later flagged as duplicate, pointing to some other question which matched the keywords but didn't at all match the actual use case or problem. I don't know if this was the result of an automated process or zealous users, but it certainly put me off ever trying to engage with the community there.
> Where SO started failing in my opinion is when the "no duplicate questions" rule started to be interpreted as "it's a duplicate if the same or very similar question has ever been answered on the site".
What else could it mean? The entire point is that if you search for the question, you should always find the best version of that question. That only works by identifying it and routing all the others there.
> That caused too many questions to have outdated answers as the tech changes
You are, generally, supposed to put the new answer on the old question. (And make sure the question isn't written in a way that excludes new approaches. Limitations to use a specific library are generally not useful in the long term.)
Of course, working with some libraries and frameworks is practically like working in a different language; those get their own tags, and a question about doing it without that framework is considered distinct as long as everyone is doing their jobs properly. The meta site exists so that that kind of thing can be hashed out and agreed upon.
> C# questions have answers that were current for .NET Core 1.0 and should be modified.
No; they should be supplemented. The old answers didn't become wrong as long as the system is backwards-compatible.
The problem is mainly technical: Stack Overflow lacked a system to deprecate old answers, and for far too long the preferred sort was purely based on score. But this also roped in a social problem: high scores attract more upvotes naturally, and most users are heavily biased against downvoting anything. In short, Reddit effects.
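The score-only sort described above, versus a hypothetical age-aware alternative, can be sketched like this (the ranking functions and the half-life value are invented purely for illustration; this is not anything Stack Overflow actually implements):

```python
from datetime import datetime

def score_only_rank(answers):
    # The long-time default: sort purely by score, so an old
    # high-scoring answer keeps its lead regardless of age.
    return sorted(answers, key=lambda a: a["score"], reverse=True)

def age_aware_rank(answers, now=None, half_life_days=365 * 3):
    # Hypothetical alternative: exponentially discount score by age,
    # so newer, still-relevant answers have a chance to surface.
    now = now or datetime.now()
    def key(a):
        age_days = (now - a["posted"]).days
        return a["score"] * 0.5 ** (age_days / half_life_days)
    return sorted(answers, key=key, reverse=True)
```

With a half-life of a few years, a 2012 answer with 100 votes would rank below a 2024 answer with 10, which is roughly the deprecation behavior the site lacked.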
> You are, generally, supposed to put the new answer on the old question
If you're asking the question, you don't know the new answer.
If you're not asking the question, you don't know the answer needs updating as it is 15 years old and has an accepted answer, and you didn't see the new question as it was marked as a dupe.
Even if you add the updated answer, it will have no votes and so has a difficult battle to be noticed with the accepted answer, and all the other answers that have gathered votes over the years.
I remain somewhat skeptical of LLM utility given my experience using them, but an LLM capable of validating my ideas OR telling me I have no clue, in a manner I could trust, is one of those features I'd really like and would happily use a paid plan for.
I have various ideas. From small scale stuff (how to refactor a module I'm working on) to large scale (would it be possible to do this thing, in a field I only have a basic understanding of). I'd love talking to an LLM that has expert level knowledge and can support me like current LLMs tend to ("good thinking, this idea works because...") but also offer blunt critical assessment when I'm wrong (ideally like "no, this would not work because you fundamentally misunderstand X, and even if step 1 worked here, the subsequent problem Y applies").
LLMs seem very eager to latch onto anything you suggest is a good idea, even if subtly implied in the prompt, and the threshold for how bad an idea has to be for the LLM to push back is quite high.
Have you tried actually asking for a detailed critique with a breakdown of the reasoning and pushback on unrealistic expectations? I've done that a few times for projects and got just what you're after as a response. The pushback worked just fine.
I have something like that in my system prompt. While it improves the model's behavior, it's still a psychopathic sycophant. It's really hard to strike a balance between it pushing way too hard in the wrong direction and being overly nice.
The latter can be really subtle too. If you're asking things you don't already know the answer to it's really difficult to determine if it's placating you. They're not optimized for responding with objective truth, they're optimized for human preference. It always takes the easiest path and it's easy for a sycophant to not look like a sycophant.
I mean literally the whole premise of you asking it not to engage in sycophancy is it being sycophantic. Sycophancy is their nature.
> I mean literally the whole premise of you asking it not to engage in sycophancy is it being sycophantic.
That's so meta it applies to everything though. You go to a business advisor to get business advice - are they being sycophantic because you expect them to do their work? You go to a gym trainer to push you with specific exercise routine - are they being sycophantic because you asked for help with exercise?
It's ultimately a trust issue and understanding motivations.
If I am talking to a salesperson, I understand their motivation is to sell me the product. I assume they know the product reasonably well, but I also assume they have no interest in helping me find a good product. They want me to buy their product specifically and will not recommend a competitor. With any other professional, I also understand the likely motivations and how they should factor into my trust.
For more developed personal relationships of course there are people I know and trust. There are people I trust to have my best interests at heart. There are people I trust to be honest with me, to say unpleasant things if needed. This is also a gradient, someone I trust to give honest feedback on my code may not be the same person I trust to be honest about my personal qualities.
With LLMs, the issue is I don't understand how they work. Some people say nobody understands LLMs, but I certainly know I don't understand them in detail. The understanding I have isn't nearly enough for me to trust LLM responses to nontrivial questions.
Fair... but I think you're also overgeneralizing.
Think about how these models are trained. They are initially trained as text-completion machines, right? Then, to turn them into chatbots, we optimize for human-preferred output, given that there is no mathematical metric for "output in the form of a conversation that's natural for humans".
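As a toy sketch of what "optimizing for human preference" means mechanically, here is the pairwise Bradley-Terry objective commonly used to train reward models; the function name and the scores are made up for illustration:

```python
import math

def preference_loss(score_chosen, score_rejected):
    # The probability that the "chosen" response beats the "rejected"
    # one is modeled as sigmoid(score_chosen - score_rejected);
    # the loss is the negative log of that probability.
    margin = score_chosen - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The reward model is pushed to score human-preferred responses higher:
# confidently ranking the preferred answer first gives low loss...
low = preference_loss(2.0, -1.0)
# ...while scoring the rejected answer higher gives high loss.
high = preference_loss(-1.0, 2.0)
```

Nothing in this objective rewards being correct; it only rewards producing whichever answer raters preferred, which is one mechanical reading of why agreeable output wins.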
The whole point of LLMs is to follow your instructions. That's how they're trained. An LLM will never laugh at your question, ignore it, or do anything else a human might naturally do, unless it is explicitly trained for that response (e.g. safety[0]).
So that's where the generalization of the more meta comment breaks down. Humans learning to converse aren't optimizing for the preference of the person they're talking to. They don't just follow orders, and if they do we call them things like robots or NPCs.
I go to a business advisor because of their expertise and because I have trust in them that they aren't going to butter me up. But if I go to buy a used car that salesman is going to try to get me. The way they do that may in fact be to make me think they aren't buttering me up.
Are they being sycophantic? Possibly. There are "yes men". But generally I'd say no. Sycophancy is on the extreme end, despite many of its features being common and normal. The LLM is trained to be a "yes man" and will always be a "yes man".
tldr:
Denpok from Silicon Valley is a sycophant, and his sycophancy leads him to feign non-sycophancy in this scene:
https://www.youtube.com/watch?v=XAeEpbtHDPw
[0] This is also why jailbreaking is not that complicated. Safety mechanisms are more like patches, and they're in an unsteady equilibrium, because the models are explicitly trained to be sycophantic.
They didn't just probably scare off some good talent: xoofx left over disagreements with higher management, and he was one of their most senior devs, the guy who started the CoreCLR migration and was leading it.
They'll get there eventually, but the current roadmap says experimental CoreCLR in late 2026, which in the best case means production-ready in 2027. Unity isn't going anywhere, but speaking as a dev who doesn't care about mobile (which is Unity's real market), competing engines have gotten much more attractive in the last couple of years, and that seems set to continue.
The funny thing about his resignation is that xoofx had a CoreCLR prototype already working around 2016-ish, but the company had "other priorities" and didn't take it seriously until recently.
The guy should just have been left alone and shielded from company bullshit to do the migration, or empowered to fight.
I know this is one-sided, but: whoever in upper management lost this guy is an absolute waste of space who didn't do his job and will blame xoofx for "not fighting harder" or some other bullshit. Fuck companies, and fuck loser managers.
I think 2016 is a bit too early but yeah, xoofx first wrote about CoreCLR in 2018 and said he'd made considerable progress with something like himself and two other engineers doing it as a side project. That is four years before Unity as a company announced the migration as a priority, which in turn is another four years before the current estimate for when they may ship it.
From my perspective, Unity has seemed very poorly managed in recent years. The editor experience isn't improving, and they continue the usual pattern of shipping features in a poor state, needing another couple of versions to become properly usable. And of course they make terrible decisions like the runtime fee: total insanity that caused a huge loss of trust and boosted Godot development enormously.
Of course my perspective is biased by me not being Unity's main target market. I work on PC strategy games, which are on Steam. At our studio, we don't do mobile, advanced graphics features aren't very relevant, and we may have the most complex UI that ever shipped in a Unity game.
I have minimal experience with modern Web tech, though I used to run a couple websites in the old days.
My main job currently is in game dev, writing C#. Working for a small studio with flexible roles, I sometimes also take the opportunity to use other tech, like actual web stuff. A couple years ago I wrote a simple HTTP API for some internal needs and that was the first time I did modern web.
I've worked in the embedded space and adjacent. I've done automotive (Autosar), I've done some bare metal applications and I've maintained a custom Linux system for a series of embedded products. I've also worked on tooling (native desktop applications) related to some of these embedded uses.
For fun I still play with some embedded development, and would like to do another Android app. I built a couple simple ones years ago and generally Android development seems pretty pleasant to me, but I haven't done Android side projects in a long time because I can't really think of any apps I'd actually like to have.
Perforce is standard in gamedev currently. As a programmer first and foremost, I prefer git but I've certainly come to appreciate the advantages of Perforce and it's an overall better fit for (bigger) game projects.
Yes, Perforce handles large files, and large folders of files, very well. It's quite efficient with deltas in binary files. It's also very useful that Perforce expects clients to check out only a part of the depot. There are folders with raw assets like uncompressed sound, layered graphics and all that, I don't check out any of that, I only check out the necessary in-engine assets.
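As a sketch of what that partial checkout looks like in practice (the depot and workspace paths here are made up), a Perforce client view can use exclusionary (minus) mappings so the raw-asset folders are never synced:

```
View:
    //depot/Game/... //my-workspace/Game/...
    -//depot/Game/RawAssets/... //my-workspace/Game/RawAssets/...
```

With a view like this, `p4 sync` fetches the in-engine assets but skips everything under the excluded folder, which is how artists' heavy source files stay off a programmer's machine.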
For code, I prefer git as I said, but in a game's depot most files are not code, and Perforce is built around handling those other assets well.
Global and foreign news is a good thing. But a lot of attention devoted to foreign stories of questionable relevance isn't a good thing.
One of my usual news sources is SVT, the national Swedish broadcaster. Their svt.se website is good and, aside from Swedish news, they're also quick to cover major foreign events; if something is breaking news they'll have it up right away. But one of my main complaints is that SVT covers local American crime too much. It pops up as one of the top "just in" headlines. I went to look just before replying here, and it's actually happening again right now: there's a "just in" headline, "Four dead in a shooting in Mississippi". It's fast, I don't even see it on cnn.com as I'm typing this. But, with all respect to the victims, mass shootings are pretty much a daily event in the US and generally have no global importance.