Integration tests on 3 top-model CAS systems

by quinyu » 07 Jun 2015, 22:43

311 indefinite integration problems were tested on TI-Nspire CX CAS, HP Prime and Casio ClassPad II emulators, with the latest OS versions available to us. The quick summary:

  • The Casio ClassPad II solved 69% of the integrals correctly, the TI-Nspire CX CAS 71% and the HP Prime 81%.
  • At a 95% confidence level, it is safe to state that the HP Prime performed significantly better than the other two calculators tested.
  • We would love it if the respective manufacturers fixed the issues.

The detailed report can be found here: http://tiplanet.org/modules/archives/download.php?id=251888

No funding of any sort was received for this ongoing test. We claim no conflict of interest. No rabbits were harmed in the procedure. Yet. ;~)
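
The significance claim in the summary can be sanity-checked with a rough sketch. The snippet below is not the method used in the report: it assumes a pooled two-proportion z-test, and it reconstructs the success counts from the rounded percentages (the exact counts are in the linked report). The names `solved` and `two_proportion_z` are made up for illustration.

```python
import math

N = 311  # number of integrals tested, from the post

# Success counts reconstructed from the rounded percentages (an assumption;
# the exact counts are in the linked report).
solved = {
    "HP Prime": round(0.81 * N),
    "TI-Nspire CX CAS": round(0.71 * N),
    "Casio ClassPad II": round(0.69 * N),
}


def two_proportion_z(x1, x2, n):
    """Return (z, two-sided p-value) for two success counts out of n trials each."""
    p1, p2 = x1 / n, x2 / n
    pooled = (x1 + x2) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * (2 / n))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


for rival in ("TI-Nspire CX CAS", "Casio ClassPad II"):
    z, p = two_proportion_z(solved["HP Prime"], solved[rival], N)
    print(f"HP Prime vs {rival}: z = {z:.2f}, p = {p:.4f}")
```

Since all three calculators were given the same 311 problems, a paired test (McNemar's, for instance) would be the more appropriate choice, but it needs the per-problem results rather than the totals.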

by Excale » 07 Jun 2015, 22:50

Nice :).

You put the result in red when it was wrong or unsuccessful. Did you also count the number of wrong answers for each calculator (1+1 -> 3 is wrong; 1+2 -> 4-1 is correct, although it's not what you want)?

by quinyu » 07 Jun 2015, 23:03

As long as the derivative of the answer given by the calculator is identical to the expression that was integrated, it was counted as correct; otherwise (or if the calculator froze/rebooted/started doing weird things) as a fail.

There is no single correct form for an integration result (for example, you can rewrite the hyperbolic functions in terms of logarithms, and that's just one example out of hundreds), but all valid forms must give the same derivative. That is: given Int(f(x),x)=g(x), if f(x)-deriv(g(x),x)=0, then it's good. Simplifications and rewrites were taken into account. If not, then the integration is wrong. Luckily, finding a derivative (as the check requires) is much simpler and quicker than integrating (this can be proven; less simple on complex numbers, but still).

I have used blue in some places as well (spot them all and figure out what was meant :P)

So, to answer your question: 1+1 -> 2 was considered correct, just like 1+2 -> 4-1. In places I complained about the bulkiness of the results (don't we all?), but as long as a result was in closed form and could be shown to give the same derivative, it was accepted.
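
The acceptance rule above (two different-looking answers are both fine as long as each differentiates back to the integrand) can be illustrated with a small sketch. This is only a numerical spot check using finite differences at a few sample points, not the symbolic check f(x)-deriv(g(x),x)=0 described in the post. The integrand, the sample points and the helper names are chosen purely for illustration.

```python
import math


def derivative(g, x, h=1e-5):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


def same_derivative(f, g, points, tol=1e-6):
    """True if g'(x) matches f(x) at every sample point, within tol."""
    return all(abs(f(x) - derivative(g, x)) <= tol for x in points)


def f(x):
    """Integrand: 1/sqrt(1 + x^2)."""
    return 1 / math.sqrt(1 + x * x)


def g_hyp(x):
    """Antiderivative in hyperbolic form."""
    return math.asinh(x)


def g_log(x):
    """The same antiderivative rewritten with a logarithm."""
    return math.log(x + math.sqrt(1 + x * x))


def g_bad(x):
    """A deliberately wrong answer, for contrast."""
    return math.atan(x)


points = [-2.0, -0.5, 0.0, 0.7, 3.0]
for name, g in (("asinh form", g_hyp), ("log form", g_log), ("wrong answer", g_bad)):
    print(name, same_derivative(f, g, points))
```

A spot check like this can only reject answers; passing at a handful of points is not a proof, which is why the symbolic comparison remains the real criterion.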

by Adriweb » 07 Jun 2015, 23:04

Nice document indeed!

Bernard Parisse reads TI-Planet, so I'm sure he'll stumble upon this topic sooner or later, but in the meantime I'm going to share this with TI; maybe it can help improve the CAS engine :)

(One more thing: it would have been fun to add Wolfram Mathematica as another CAS engine; it probably would have obliterated all 3 calcs :P)

by Excale » 07 Jun 2015, 23:06

I was more thinking about putting it in red when it answered with an integral.

It's a fail, but not a wrong result.

And... I really prefer to get a fail over a wrong result.

Edit:
The other fun fact is that if you combine all 3 calcs, you get a very good result. So: buy all of them :P.
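
The distinction drawn above, between a calculator that gives up (a fail) and one that returns something incorrect (a wrong result), could be tracked with a three-way tally instead of a single red mark. The sketch below is an illustration only: the `Outcome` names, the `classify` helper and the sample observations are hypothetical and do not come from the report.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    CORRECT = "correct"  # closed form whose derivative matches the integrand
    FAIL = "fail"        # no closed form: unevaluated integral, freeze, reboot
    WRONG = "wrong"      # closed form returned, but its derivative differs


def classify(gave_closed_form: bool, derivative_matches: bool) -> Outcome:
    """Map one observed answer to one of the three outcomes."""
    if not gave_closed_form:
        return Outcome.FAIL
    return Outcome.CORRECT if derivative_matches else Outcome.WRONG


# Hypothetical observations: (gave_closed_form, derivative_matches)
observations = [(True, True), (False, False), (True, False), (True, True)]

tally = Counter(classify(*obs) for obs in observations)
print({outcome.value: tally[outcome] for outcome in Outcome})
```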

by quinyu » 07 Jun 2015, 23:22

As for TI, I don't know; I keep submitting this stuff (about once every 100 integrals covered) to TI as well as Casio (couldn't find an e-mail address for HP yet). As for Wolfram Mathematica, you would be surprised: it can sometimes go very wrong. RUBI is my choice of integrator there.

As for a partial result: I still count it as a fail, since the calculator throws the part it ultimately could not solve back at us. That will stay that way.

I did the statistics check: 33 of the 311 problems could not be solved by any of the three calculators. That's about a tenth of the problems. And currently I'm pretty damn fine with my FX-991DE plus, not to mention that programmable and graphical calcs are not permitted in my school, at least not in the tests. No reason to invest in any of the three. Maybe later.
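
The arithmetic behind the combined figure is short enough to write out. Only the totals 311 and 33 come from the post; the `unsolved_by_all` helper and the tiny per-problem sets at the end are hypothetical and merely show how the count could be derived from per-calculator results.

```python
TOTAL = 311           # problems tested, from the post
UNSOLVED_BY_ALL = 33  # problems none of the three calculators solved, from the post

solved_by_at_least_one = TOTAL - UNSOLVED_BY_ALL
print(f"solved by at least one calculator: {solved_by_at_least_one}/{TOTAL} "
      f"= {solved_by_at_least_one / TOTAL:.1%}")
print(f"solved by none: {UNSOLVED_BY_ALL / TOTAL:.1%}")


def unsolved_by_all(all_ids, *solved_sets):
    """Return the problem ids that appear in none of the solved sets."""
    return set(all_ids) - set().union(*solved_sets)


# Hypothetical per-calculator results for five problems, for illustration only.
example = unsolved_by_all(range(1, 6), {1, 2}, {2, 3}, {1, 5})
print(sorted(example))
```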

by Adriweb » 07 Jun 2015, 23:28

quinyu wrote: As for TI, I don't know; I keep submitting this stuff (about once every 100 integrals covered) to TI as well as Casio (couldn't find an e-mail address for HP yet);

For TI, I (and some more people here) happen to know some TIers directly, so we can report bugs etc. directly to them (instead of going through TI-Cares etc.).
For HP, the CAS engine of the Prime is giac/xcas, which is developed by Bernard Parisse, who is a member of this forum :)
And I don't know about Casio.

By the way, maybe I skipped/didn't see it in the .pdf, but did you use the student software for the Nspire tests, or an actual device?
I have actually developed a REPL for the Nspire's CAS (not public yet), which can run several dozen (hundreds?) of calculations per second. It works either as a REPL, or it takes input from a file and writes the output to stdout. That would probably help for a test suite.
And I suppose that some kind of REPL/command-line interface for giac is trivial to get.

by quinyu » 07 Jun 2015, 23:36

Only software for all three. But since I don't like time-limited options, that's kArmTI running there for the TI. Casio and HP run the manufacturer-released emus (with Casio running virtualised). As for hundreds of calculations per second: most of the time goes into the actual typing, so it's nice but wouldn't help me much. Thanks for mentioning it anyhow. Casio's emu is one release behind the real device, but since I found no way to install the new OS on it, I'm leaving that as it is for now.

by Bisam » 07 Jun 2015, 23:39

I can see many issues in the paper:
  • Why is the TI-Nspire's answer for #18 not accepted? I suppose that it is because of automatic verification... but it is perfectly correct.
  • The same goes for #39...
  • #50 is counted as wrong for both the Nspire and the ClassPad... when it is correct for both. Why is that?
  • #52 and #53 are again correct for the Nspire but counted as wrong.
  • #68 is counted as a fail when it should be the best answer! The other two answers are wrong when n is -1. However, the Nspire doesn't give an answer even if n is specified to be positive, for example.

I didn't go through all the answers, but I'd like to know the reasons for excluding some good answers...

by Adriweb » 07 Jun 2015, 23:45

quinyu wrote: Only software for all three. But since I don't like time-limited options, that's kArmTI running there for the TI.

I can't help but mention Firebird Emu, now that it's out :)

quinyu wrote: Casio and HP run the manufacturer-released emus (with Casio running virtualised).

Note: the official software packages are actually simulators, not emulators. They (try to) reproduce the software's behaviour by compiling the source code for the desktop architecture rather than the calc's; they do not reproduce the hardware, as an emulator would.

quinyu wrote: As for hundreds of calculations per second: most of the time goes into the actual typing, so it's nice but wouldn't help me much. Thanks for mentioning it anyhow.

Well, that would precisely allow you to keep all the tests in a .txt file. Running them all and comparing the output to the expected result would give you test results in a matter of seconds, which is vastly faster than typing every single one by hand and comparing the output manually :P
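
A file-driven run like the one described above could look roughly like this. Everything here is an assumption made for illustration: the one-case-per-line `expression ; expected` format, the `parse_cases` and `run_suite` helpers, and the `fake_evaluate` stand-in, which replaces a real CAS backend (neither the Nspire REPL nor a giac command line is called here).

```python
import io


def parse_cases(stream):
    """Yield (expression, expected) pairs from lines shaped 'expr ; expected'.

    Blank lines and lines starting with '#' are skipped; every other line
    must contain a ';' separator.
    """
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        expr, expected = (part.strip() for part in line.split(";", 1))
        yield expr, expected


def run_suite(stream, evaluate):
    """Run each case through evaluate(expr); return (total, list of mismatches)."""
    total = 0
    failures = []
    for expr, expected in parse_cases(stream):
        total += 1
        actual = evaluate(expr)
        if actual != expected:
            failures.append((expr, expected, actual))
    return total, failures


# Stand-in backend: a lookup table instead of a real CAS (hypothetical).
FAKE_CAS = {"integrate(2*x,x)": "x^2", "integrate(cos(x),x)": "sin(x)"}


def fake_evaluate(expr):
    """Return the canned answer, or echo the input back as 'unevaluated'."""
    return FAKE_CAS.get(expr, expr)


SAMPLE = """\
# integrand ; expected antiderivative
integrate(2*x,x) ; x^2
integrate(cos(x),x) ; sin(x)
integrate(exp(x^2),x) ; sqrt(pi)*erfi(x)/2
"""

total, failures = run_suite(io.StringIO(SAMPLE), fake_evaluate)
print(f"{total - len(failures)}/{total} passed")
for expr, expected, actual in failures:
    print(f"MISMATCH {expr}: expected {expected}, got {actual}")
```

Comparing output strings is the naive part: since equivalent antiderivatives can be written in many ways, a real suite would more likely compare the derivative of each answer with the integrand than compare text.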
