Isn’t this basically the Swiss cheese model? If your two input AIs hallucinate, or your consensus AI misunderstands the input, you will still have confabulations in the output?
kuberwastaken 1 days ago [-]
From all my testing, this never really happened even once honestly, plus the judge model (that I've kept strictly a reasoning model) also evaluates individually before "judging" the consensus.
TheKelsbee 2 days ago [-]
I have this same thought, and have tried similar approaches.
OP: Have you trained or fine tuned a model that specifically reasons the worker model inputs against the user input? Or is this basically just taking a model and turning the temperature down to near 0?
kuberwastaken 1 days ago [-]
Low temperature, heavy prompting to answer in a structured way. Sadly can't fine train models since this is API based but the approach does work!
sks38317 2 days ago [-]
I’m genuinely interested in how you arrived at the concept of using AI as a method to treat hallucinations. What inspired that approach?
kuberwastaken 1 days ago [-]
Honestly, personal use cases. I am a STEM student and deal with a lot of "hard" questions that are about 60% of the time miscalculated by LLMs, I used to manually paste in approaches from say ChatGPT to DeepSeek and now grok and asked them what do you think is better. I created this out of necessity to automate this then realized how cool it can be if it scales further haha
Very interesting. Will this be available as a meta model via API, allowing use in the coding tool of my choice?
kuberwastaken 24 hours ago [-]
Eventually yes, that's the plan! It's extremely good with code too, especially with more vague requests, tends to take about 2-3 rounds but almost always gets a great approach.
shemulray667 2 days ago [-]
[flagged]
0m3g4_k1ng 21 hours ago [-]
[flagged]
kuberwastaken 20 hours ago [-]
Thank you? Haha
Rendered at 13:00:38 GMT+0000 (Coordinated Universal Time) with Vercel.
OP: Have you trained or fine tuned a model that specifically reasons the worker model inputs against the user input? Or is this basically just taking a model and turning the temperature down to near 0?