|
Post by Pjot on Feb 4, 2015 20:36:09 GMT 1
All,

By selecting the temporary return buffer in a smarter way, the performance of repetitive string operations has gone up by almost 40%. Suppose we have the following program:

BLA$ = ""
FOR i = 1 TO 50000
    BLA$ = BLA$ & "@"
NEXT

Previously, on my laptop the timing would be this:

Now with the new version:

This is a difference of 100-(170/280)*100 = 39.3%.

The latest version is available from the usual location.

BR Peter
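The percentage above can be reproduced with a small helper. This is only an illustrative sketch: the function name `improvement_pct` is made up here, and the two timings are taken as milliseconds (280 and 170) straight from the formula in the post.

```c
#include <assert.h>

/* Hypothetical helper: percentage by which a run got faster,
   following the formula in the post: 100 - (new/old)*100. */
double improvement_pct(double old_time, double new_time)
{
    assert(old_time > 0.0);   /* guard against division by zero */
    return 100.0 - (new_time / old_time) * 100.0;
}
```

Calling `improvement_pct(280.0, 170.0)` gives roughly 39.3, matching the figure quoted above.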
|
|
|
Post by Pjot on Apr 4, 2015 22:30:05 GMT 1
All,

After a long time of redesigning the string handling routines, and trying different approaches in the implementation, the final solution is now ready. The string buffer handling itself was already improved, which also led to a code overhaul of the string functions. The actual buffer selection procedure has been simplified in such a way that it lowers the memory footprint, increases the nested string handling performance and improves the code maintainability in the long term.

A quick performance test between 5 different BASICs shows the following result (using the test program mentioned earlier):

The latest release can be found at the usual location.

BR Peter

EDIT: added Gambas BASIC, X11 Basic and LibreOffice BASIC. I created a macro without a document and executed it from the command line:

PS: does anybody know where the Linux version of BCX has gone?
|
|
|
Post by jcfuller on Apr 12, 2015 15:48:25 GMT 1
|
|
|
Post by Pjot on Apr 13, 2015 18:45:03 GMT 1
Thanks James,
I'll give it a try.
BR Peter
|
|
|
Post by Pjot on Jan 26, 2016 20:22:03 GMT 1
All,

Lately, I have been working on string performance again. I have tested several ideas, with varying degrees of success. During these tests, another idea popped up which was relatively simple to implement. It comes down to a 'pointer-swap' of the intermediate result buffer with the target string buffer. Using this technique there is no need to copy the result back into the target, and memory copying is expensive in terms of performance.

Let's see what happens when using the above program:

BLA$ = ""
FOR i = 1 TO 50000
    BLA$ = BLA$ & "@"
NEXT

The first 3.x releases of BaCon needed 0.280 seconds, which, after rewriting the intermediate buffer implementation, was brought down to 0.170 seconds. But with this new improvement we can see the following (using the same test machine and taking the average result after multiple runs):

This is another improvement of 100-(110/170)*100 = 35.3%.

The latest beta can be obtained from here.

BR Peter
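To make the 'pointer-swap' idea concrete, here is a minimal sketch in C. It is not BaCon's actual implementation: the `StrBuf` type and the functions `strbuf_reserve` and `concat_swap` are hypothetical names invented for this illustration. The result is assembled in a scratch buffer and then the scratch and target buffers simply trade places, so the result is never copied back.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not BaCon's real data structure. */
typedef struct {
    char  *data;   /* NUL-terminated contents, or NULL when unused */
    size_t len;    /* length without the terminator */
    size_t cap;    /* allocated bytes */
} StrBuf;

/* Make sure the buffer holds at least 'need' bytes, doubling as it grows. */
static int strbuf_reserve(StrBuf *b, size_t need)
{
    if (need <= b->cap) return 0;
    size_t cap = b->cap ? b->cap : 16;
    while (cap < need) cap *= 2;
    char *p = realloc(b->data, cap);
    if (p == NULL) return -1;
    b->data = p;
    b->cap = cap;
    return 0;
}

/* target = a & b. 'a' may point at target->data (as in BLA$ = BLA$ & "@"),
   but neither argument may point into the scratch buffer. */
int concat_swap(StrBuf *target, StrBuf *scratch, const char *a, const char *b)
{
    assert(target != scratch && a != NULL && b != NULL);
    /* Reuse the stored length when the left operand is the target itself. */
    size_t alen = (a == target->data) ? target->len : strlen(a);
    size_t blen = strlen(b);
    if (strbuf_reserve(scratch, alen + blen + 1) != 0) return -1;
    memcpy(scratch->data, a, alen);
    memcpy(scratch->data + alen, b, blen + 1);   /* includes the NUL */
    scratch->len = alen + blen;

    StrBuf tmp = *target;   /* the swap: no character data is moved */
    *target = *scratch;
    *scratch = tmp;
    return 0;
}
```

After the swap the old target buffer becomes the scratch buffer for the next operation, so its allocation is reused instead of freed.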
|
|
|
Post by Deleted on Jan 27, 2016 2:59:41 GMT 1
Hi Peter,
I ran your bench program using Script BASIC on my old Toshiba laptop.
Note: For those unfamiliar with Script BASIC, it's a cross platform console mode interpreter written in ANSI C by Peter Verhas.
BLA$ = ""
FOR i = 1 TO 50000
    BLA$ &= "@"
NEXT
PRINT LEN(BLA$),"\n"

jrs@laptop:~/sb/sb22/test$ time scriba bench.sb
50000

real 0m0.133s
user 0m0.125s
sys 0m0.008s
jrs@laptop:~/sb/sb22/test$
Here is the new BaCon 3.3 version on my laptop.
BLA$ = ""
FOR i = 1 TO 50000
    BLA$ = BLA$ & "@"
NEXT
PRINT LEN(BLA$),"\n"

jrs@laptop:~/BaCon/bacon-3.3$ bacon bench.bac
Converting 'bench.bac'... done, 7 lines were processed in 0.005 seconds.
Compiling 'bench.bac'... cc -c bench.bac.c
cc -o bench bench.bac.o -lbacon -lm
Done, program 'bench' ready.
jrs@laptop:~/BaCon/bacon-3.3$ time ./bench
50000

real 0m0.229s
user 0m0.220s
sys 0m0.004s
jrs@laptop:~/BaCon/bacon-3.3$ bacon -v
BaCon version 3.3 beta - (c) Peter van Eerten - MIT License.
jrs@laptop:~/BaCon/bacon-3.3$
This is creating a 1.000.000 character @ string.
jrs@laptop:~/BaCon/bacon-3.3$ time ./bench
1000000

real 2m3.698s
user 2m3.512s
sys 0m0.048s
jrs@laptop:~/BaCon/bacon-3.3$ cd ~/sb/sb22/test
jrs@laptop:~/sb/sb22/test$ time scriba bench.sb
1000000

real 1m26.229s
user 1m5.785s
sys 0m20.309s
jrs@laptop:~/sb/sb22/test$
|
|
|
Post by Pjot on Jan 27, 2016 12:47:40 GMT 1
Hi jrs,
Well, this is not what my ScriptBasic is showing with the original program.
Assuming this test program:
BLA$ = ""
FOR i = 1 TO 50000
    BLA$ = BLA$ & "@"
NEXT
PRINT LEN(BLA$), "\n"
When executed unmodified with ScriptBasic I can see the following:
While using BaCon 3.3, I can see this:
However, you rewrote the program to:
BLA$ = ""
FOR i = 1 TO 50000
    BLA$ &= "@"
NEXT
PRINT LEN(BLA$),"\n"
And then the performance of ScriptBasic gets better, the reason being that the "&=" operator is optimized for one particular situation:
But you're changing the code (cheating!) and this is not the test. I can also modify the program in such a way that it performs better for BaCon, as follows:
BLA$ = FILL$(50000, ASC("@"))
PRINT LEN(BLA$)
Instead of using the "&=" operator, I am using the FILL$ function, which is optimized for this situation. Now my performance is:
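Why a dedicated fill operation is so much cheaper than 50000 appends can be sketched in C. This is an assumption about the general technique, not BaCon's FILL$ source: the helper name `fill_string` is invented here. The length is known up front, so one allocation and one `memset` are enough.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, named after but not taken from BaCon's FILL$.
   Returns a new string of 'count' copies of 'ch'; the caller frees it. */
char *fill_string(size_t count, char ch)
{
    assert(ch != '\0');          /* a NUL fill would yield an empty C string */
    char *s = malloc(count + 1);
    if (s == NULL) return NULL;
    memset(s, ch, count);        /* one pass, no repeated reallocation */
    s[count] = '\0';
    return s;
}
```

The loop-based benchmark, by contrast, has to grow and re-terminate the string on every iteration, which is exactly the work the benchmark is meant to measure.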
BR Peter
|
|
|
Post by Deleted on Jan 27, 2016 19:12:59 GMT 1
You took that explanation to the absolute edge. Too funny. I rarely use a this = this & that and use &= with my string appends.
Here is your original on my old laptop.
BLA$ = ""
FOR i = 1 TO 50000
    BLA$ = BLA$ & "@"
NEXT
PRINT LEN(BLA$), "\n"

jrs@laptop:~/sb/sb22/test$ time scriba peterbench.sb
50000

real 0m1.863s
user 0m1.795s
sys 0m0.004s
jrs@laptop:~/sb/sb22/test$
Are you still using a 32 bit version of scriba on Linux?
|
|
|
Post by Pjot on Jan 27, 2016 20:25:42 GMT 1
jrs,
You can try to make me look ridiculous, but you know d*** well you have modified the actual benchmark program (which is clearly mentioned in the above posts, multiple times).
And accidentally, with this change, all of a sudden your advertised ScriptBasic interpreter seems to be 'faster'.
Please abstain from deceptive posts like this and try to be clear and straightforward.
Thank you, Peter
PS I am not using the 32-bit version as I am not using ScriptBasic at all - not for a long time, and for reasons you already know.
|
|
|
Post by alexfish on Feb 4, 2016 21:41:43 GMT 1
Interesting..
these benchmark things
also Interesting.. from Quote of the day
BR Alex
|
|
|
Post by Pjot on Feb 5, 2016 21:09:38 GMT 1
Hi Alex,

Well, I found where this comes from, and it's too bad that jrs continues his efforts to paint me black... But we have known his character for a long time, so it simply is a waste of time to spend more words on it, let alone respond to his insinuations and allegations (I do not owe him an explanation anyway - who does he think he is, my wife?).

So, for the string performance in BaCon, it started with my comparison to other languages, where I found that even though BaCon converts to C and compiles to a binary, its string operations are relatively slow. This finding got worse when looking into C libraries which optimize string operations. How come that, for example, a simple SDS based program shows such fast performance?

PRAGMA INCLUDE sds-master/sds.h sds-master/sds.c
DECLARE myvar TYPE sds
myvar = sdsnew("")
FOR i = 1 TO 20000
    myvar = sdscat(myvar, "@")
NEXT
Further study reveals that the above program implements a special kind of concatenation, which does not take into account the properties of a generic string concatenation, like variable arguments, nesting, different string types etc., all without memory leaks. It is the sum of all these properties, in combination with the use of plain C character arrays, which causes string operations to become slow.

A BASIC implementation like FreeBASIC, which converts to assembly, will always beat BaCon, simply because of its usage of low-level machine code. The FreePASCAL compiler beats BaCon because of the way it constructs its strings. They use a proprietary binary layout which is not compatible with plain libc calls like 'printf' or 'strcat', which prevents us from using the same binary layout in BaCon.

Nevertheless, I could optimize the BaCon implementation in such a way that string operation performance has gone up by more than 60% overall.

So in the meantime, I am looking into some other ideas to get string operations faster in the generated C code. As a matter of fact, this is what I have been checking out in my spare time for the last couple of weeks (hence my silence on the forum). Not sure if these ideas are feasible, but if so, I'll post the results here.

BR Peter
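One reason a specialized append like the `sdscat` loop above can be fast is that the string carries its own length and spare capacity, so nothing has to be rescanned. The sketch below illustrates that general idea with a header stored just in front of the characters. It is a simplified illustration, not the SDS source code: the names `Hdr`, `hs_new`, `hs_cat`, `hs_len` and `hs_free` are all invented here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header placed directly in front of the characters. */
typedef struct {
    size_t len;   /* used bytes, excluding the terminator */
    size_t cap;   /* room for characters, excluding the terminator */
} Hdr;

static Hdr *hdr_of(char *s) { return (Hdr *)(s - sizeof(Hdr)); }

/* Create a string; the returned pointer is a plain NUL-terminated char *. */
char *hs_new(const char *init)
{
    size_t len = strlen(init);
    Hdr *h = malloc(sizeof(Hdr) + len + 1);
    if (h == NULL) return NULL;
    h->len = len;
    h->cap = len;
    char *s = (char *)(h + 1);
    memcpy(s, init, len + 1);
    return s;
}

size_t hs_len(char *s) { return hdr_of(s)->len; }   /* O(1), no scan */

/* Append 't' (which must not point into 's'). Returns the possibly moved
   string, or NULL on allocation failure, in which case 's' is untouched. */
char *hs_cat(char *s, const char *t)
{
    assert(s != NULL && t != NULL);
    Hdr *h = hdr_of(s);
    size_t tlen = strlen(t);
    size_t need = h->len + tlen;
    if (need > h->cap) {
        size_t cap = need * 2;               /* grow with headroom */
        Hdr *nh = realloc(h, sizeof(Hdr) + cap + 1);
        if (nh == NULL) return NULL;
        h = nh;
        h->cap = cap;
        s = (char *)(h + 1);
    }
    memcpy(s + h->len, t, tlen + 1);         /* write at the known end */
    h->len = need;
    return s;
}

void hs_free(char *s) { if (s != NULL) free(hdr_of(s)); }
```

Because the pointer handed out still points at ordinary NUL-terminated characters, it can be passed to `printf`. What this sketch does not handle is exactly what the post lists as the hard part of a generic concatenation: variable arguments, nesting and mixed string types.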
|
|
|
Post by jcfuller on Feb 6, 2016 11:08:25 GMT 1
Peter, I believe the latest FreeBasic 64 translates to C, not asm? Any testing done with this one?
Would using Glib's string be an alternative?
James
|
|
|
Post by Pjot on Feb 6, 2016 12:00:35 GMT 1
Hi James,

Regarding FreeBASIC, that could be, I will take another look at their implementation to see how they do it. I know it used to be ASM once - I have downloaded their latest 1.05 now and performed some tests (results below).

So for the benchmark, I have changed it somewhat, setting a higher value for the loop:

BLA$ = ""
FOR i = 1 TO 200000
    BLA$ = BLA$ & "@"
NEXT
PRINT "BaCon"

bla$ = ""
for i = 1 to 200000
    bla$ = bla$ + "@"
next
rem print len(bla$)
print "Free Basic"

For Awk, Perl, and Python I used the same program as before (also using the higher value of 200000). The graph below leaves out Basic versions like sdlBasic, YaBasic, ScriptBasic, BWBasic, X11Basic and LibreOffice Basic, because their results are simply off the scale.

For your 'UbxWx', it created a C program all right, but it crashed when executing the 'strcat' in the inner loop (no memory allocated) - probably I am doing something wrong there? If you have a suggestion on the bench program for BCX I am happy to test it as well.

Anyway, though the result for BaCon is getting better, there's still some homework to do.

BR Peter
|
|
|
Post by jcfuller on Feb 6, 2016 17:19:59 GMT 1
Peter,

UbxWx has a default size of 2048. It also uses strcat so it takes a LONG time. What are you using to time it?

James

'=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
'Peter string test
'=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
$WXC
'Linux
$ONEXIT "~/UbxWx/comp.sh $FILE$"
'==============================================================================
'==============================================================================
Function main() As Integer
    Dim Bla$ * 200001
    Dim i As Integer
    For i = 1 To 200000
        Bla$ = Bla$ & "@"
    Next
    Print "UbxWx"
    Function = 0
End Function
'==============================================================================
|
|
|
Post by Pjot on Feb 6, 2016 19:52:03 GMT 1
Thanks James,

I have commented out the line with $ONEXIT and converted/compiled your program. Please let me know if I did something wrong. Multiple runs deliver a result between 11.520s and 11.540s.

Like you already mentioned, the use of 'strcat' probably is the culprit here. In my implementation, I have replaced almost all functions from <string.h> with their mem equivalents (memcpy, memmove). This alone should give a huge performance increase.

One interesting thing is the resulting binary size. If I strip all binaries and compare their sizes, BaCon has the smallest resulting binary size (see graph). But this is not the test, of course.

BR Peter
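The difference between 'strcat' and the mem functions in an append loop can be sketched as follows. This is an illustration of the general technique under the assumption that the caller keeps track of the current length; the function names `append_strcat` and `append_memcpy` are hypothetical and not taken from BaCon or UbxWx.

```c
#include <assert.h>
#include <string.h>

/* Append with strcat: every call first walks 'dst' to find its terminator,
   so a loop of n appends touches on the order of n*n/2 characters. */
void append_strcat(char *dst, size_t cap, const char *src)
{
    assert(strlen(dst) + strlen(src) < cap);   /* room for text plus NUL */
    strcat(dst, src);
}

/* Append with memcpy: the caller supplies the current length, so the copy
   starts at the end immediately. 'src' must be NUL-terminated with length
   'srclen'. */
void append_memcpy(char *dst, size_t *len, size_t cap,
                   const char *src, size_t srclen)
{
    assert(*len + srclen < cap);               /* room for text plus NUL */
    memcpy(dst + *len, src, srclen + 1);       /* copies the NUL as well */
    *len += srclen;
}
```

Both functions leave the same bytes in the buffer; only the amount of scanning per call differs, which is what dominates a 200000-iteration loop.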
|
|