ChatGPT took the PAES, and these are the results

To gauge the capabilities of this tool created by OpenAI, it was put through the standardized exam for admission to higher education. The artificial intelligence still "hallucinates" on certain questions and makes mistakes.

Could an artificial intelligence get a perfect score on the PAES? The tool that has taken everyone by storm over the past year was put to the test with the Higher Education Admissions Test, which measures the ability to understand and solve problems in reading comprehension, history, mathematics, and science. The results were surprising.

Recent updates to ChatGPT allow it to "see" and answer questions involving graphs and other figures it previously could not handle, but have also drawn complaints that it has become less intelligent. How might this affect its ability to take the PAES?

That is what the team at EvoAcademy, a digital academy focused on the latest trends in technology, artificial intelligence, and digital marketing, set out to verify. The company tested the tool with the PAES. ChatGPT answered between 81.3% and 96% of the questions correctly: its lowest score was 745, in the physics test, and its highest was 918, in the reading comprehension test.

Using the tool WebPlotDigitizer, the team also extracted estimated densities from the PAES score distributions reported in DEMRE's graphs. With this information, they estimated that ChatGPT would have placed in the top 4% of those who took the reading comprehension test, its best result.
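The percentile estimate described above can be sketched in a few lines: once a tool like WebPlotDigitizer has turned a published distribution graph into (score, density) points, the area under the curve up to a given score gives the fraction of test-takers below it. This is only an illustration of the idea; the data points below are invented, not DEMRE's actual figures, and this is not EvoAcademy's code.

```python
# Illustrative sketch: estimating the percentile of a PAES score from
# density points extracted off a published graph. All numbers are made up.
scores    = [200, 400, 500, 600, 700, 800, 900, 1000]
densities = [0.0002, 0.0012, 0.0020, 0.0024, 0.0018, 0.0010, 0.0004, 0.0001]

def percentile_of(score, xs, ys):
    """Trapezoidal integration of the density up to `score`,
    normalized by the total area under the curve."""
    total = below = 0.0
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        area = (x1 - x0) * (y0 + y1) / 2
        total += area
        if x1 <= score:
            below += area
        elif x0 < score:  # partial trapezoid up to `score`
            frac = (score - x0) / (x1 - x0)
            y_mid = y0 + frac * (y1 - y0)
            below += (score - x0) * (y0 + y_mid) / 2
    return 100 * below / total

pct = percentile_of(918, scores, densities)
print(f"Estimated percentile for a score of 918: {pct:.1f}")
```

With real extracted points, a score like 918 landing in the top few percent falls out directly from this kind of calculation.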

ChatGPT's results on the PAES. Credit: EvoAcademy.

"We were used to technology being very exact and deterministic, and large language models like ChatGPT don't work that way," comments Sebastián Cisterna, CEO of EvoAcademy. In reality, these technologies predict the best next word based on the information they have. "For example, if I say 'a shrimp that falls asleep…', the AI would tell me, based on that context, that what comes next is 'gets carried away by the current.' That is mainly what they do. And for that same reason, every time you give it an instruction, it will try to say the most likely word or concept that comes next," he adds.
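Cisterna's description of next-word prediction can be illustrated with a toy bigram model: count which word tends to follow each word, then predict the most frequent continuation. This is only a sketch of the principle on a made-up three-sentence corpus; ChatGPT uses a vastly larger neural network, not frequency counts.

```python
# Toy next-word predictor: for each word, count what follows it,
# then predict the most frequent continuation.
from collections import Counter, defaultdict

corpus = (
    "a shrimp that falls asleep is carried away by the current . "
    "the current drags the shrimp . "
    "the current wins ."
).split()

following = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    following[word][nxt] += 1

def predict_next(word):
    """Return the word most often observed after `word`."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # prints "current"
```

Given the context "the", the model picks "current" simply because that continuation was seen most often, which is the same statistical intuition behind Cisterna's shrimp example.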

ChatGPT took the PAES and didn't get a perfect score: why does this AI make things up?

This makes ChatGPT very coherent when it speaks, but its answers are not always true. This phenomenon is known as "hallucination."

Hallucinations are part of generative artificial intelligence, such as ChatGPT and similar tools, which are particularly prone to this problem. "We can reduce them by modifying part of the code, but they can never be brought down to zero," explains Cisterna. According to him, hallucinating is inherent to these programs.

Now that the latest update lets this technology "see" and interpret images, ChatGPT was better able to answer math questions involving graphs and charts. It still struggles, however, to interpret the illustrations in the biology test and other more complex images.

"For example, mathematics includes several geometry problems, and it solved those very well, obviously with the corresponding figures. What surprised us is that in science it did not interpret them as well. It seems that, deep down, science is a somewhat more complex field when it comes to interpreting the numbers involved," he adds.

In any case, the artificial intelligence expert says this cannot be generalized, because the model did very well in mathematics. It could be addressed by building a more specialized GPT instance accustomed to these topics, a process known as "fine-tuning." "I think we could probably make a PAES-GPT, so to speak, and push its specific knowledge of the subject a little further," he describes.

If we carried out this fine-tuning process to specifically teach it a bit more about the PAES, we would probably get a better result. Likewise, Cisterna notes that because of its rate of hallucinations, the tool may be most useful for students at or below the average score.

"We expected it to do a little better, because in previous experiences it had. In the tests we ran in April, together with other tests published by DEMRE, the results were better," he notes. It should be kept in mind that those earlier tests did not include image interpretation, which is why the model was only tested on the reading comprehension and written-knowledge exams.

Students sitting the PAES. Photo: Agencia Uno.

Finally, Cisterna comments that ChatGPT was tested in both its free and its paid version, and the subscription version performed better than the free one: performance on the PAES improved by 15% to 20% with the paid version. Considering this, the CEO of EvoAcademy explains that this is because the free version still runs on GPT-3.5, while the paid version already includes the GPT-4 updates.

Source: Latercera
