10 Ways GPT-4 Is Awesome But Still Flawed


The system seemed to respond appropriately. But the answer did not take into account the height of the doorway, which could also prevent a tank or car from passing through.

OpenAI CEO Sam Altman said the new bot could reason “a little bit.” But its reasoning abilities fall apart in many situations. The older version of ChatGPT handled the question a bit better because it recognized that height and width mattered.

OpenAI said the new system could score among the top 10 percent of students on the Uniform Bar Exam, which qualifies lawyers in 41 states and territories. It can also score a 1,300 (out of 1,600) on the SAT and a five (out of five) on high school AP tests in biology, calculus, macroeconomics, psychology, statistics, and history, according to the company's tests.

Earlier versions of the technology failed the Uniform Bar Exam and did not score as high on most Advanced Placement tests.

On a recent afternoon, to demonstrate its test-taking skills, Mr. Brockman gave the new bot a paragraph-long bar exam question about a man who runs a diesel truck repair business.

The answer was correct but full of legalese. So Mr. Brockman asked the bot to explain the answer in plain language for a layperson. It did that, too.

Although the new bot seemed to reason about things that had already happened, it was less adept when asked to form hypotheses about the future. It seemed to draw on what others have said rather than creating new conjectures.

When Dr. Etzioni asked the new bot, “What are the important problems to solve in NLP research over the next decade?” (referring to the kind of “natural language processing” research that drives the development of systems like ChatGPT), it could not formulate entirely new ideas.

The new bot still invents things. Called “hallucination,” the problem haunts all leading chatbots. Because the systems do not have an understanding of what is true and what is not, they can generate text that is completely false.

When asked for the addresses of websites describing the latest cancer research, it sometimes generated internet addresses that did not exist.

