|
Post by bigbass on Mar 14, 2023 18:50:06 GMT 1
Hello Peter,

I am very happy with your string code improvements as is, and I am thankful for all your improvements over the years. I have no complaints; all is well and good! BaCon is very fast, and it is not normal or everyday that we need a few million loops.

----------------------------------------------------

Here is how I get around some of the problems with strings in general on the slow RPI3, using only pure BaCon code. I parsed a JSON file from Chrome (normally that would be done in JavaScript, but I wasn't sure how to do it in JavaScript at that time; I later ported it to JavaScript).

PROBLEMS when we have to parse strings. This is just "one workaround"; there are many we could use.

1.) Sometimes we have dynamic data: the file size and data are unknown or changing, so we can't hard-code anything. So we LOAD the data into a string txt$.
2.) We SPLIT the string into an array$[]. We think about this as a long stream (like sed, or JavaScript parsing HTML). REGEX works well for filters. We handle leftside$ of the equal sign and rightside$ of the equal sign differently during the process. When done, we CONCAT$ them back together and write the result to a file. Done.

This works really fast in BaCon:

time ./blinks
/tmp/Bookmarks.html created!
real 0m0.012s
user 0m0.011s
sys 0m0.001s

and it is 100% pure BaCon code.

Joe

Here is the full demo (an old demo), blinks.bac, bookmarks 2 links:

'---version 1.2 Jan 17 2021
'---chrome JSON 2 HTML bookmarks by bigbass
'---automatically converts google chrome JSON formatted bookmarks to cleaned html bookmarks
'---make the bookmarks header format manually
OPEN "/tmp" & "/Bookmarks.html" FOR WRITING AS myfile

'---header
WRITELN "<!DOCTYPE NETSCAPE-Bookmark-file-1><!-- This is a manually \
generated file. You can use your chrome Bookmarks.html and overwrite \
this or make your personal list by editing \
this simplified version by bigbass --><META HTTP-EQUIV=\"Content-Type\" \
CONTENT=\"text/html; charset=UTF-8\"><TITLE>Bookmarks</TITLE> \
<H1>Bookmarks</H1><DL><p><DL><p>" TO myfile

'---doublequote shortcut
Q$ ="\""

bookmarks$ = GETENVIRON$("HOME") & "/.config/chromium/Default/Bookmarks"
txt$ = LOAD$(bookmarks$)

SPLIT txt$ BY "," TO array$ SIZE dimension

FOR i = 0 TO dimension -1
    '---strip the JSON punctuation
    array$[i] = REPLACE$(array$[i], "{" , "")
    array$[i] = REPLACE$(array$[i], "}" , "")
    array$[i] = REPLACE$(array$[i], "[" , "")
    array$[i] = REPLACE$(array$[i], "]" , "")
    '--- add javascript support for bookmarklets
    array$[i] = REPLACE$(array$[i], "javascript=" , "javascript:")
    '---remove double quotes both sides
    array$[i] = REPLACE$(array$[i], "\"" , "")

    IF REGEX(array$[i], "name:") THEN
        '--- the name becomes the link text (right side)
        rightside$ = REPLACE$(array$[i], "name:" , ">")
    END IF

    IF REGEX(array$[i], "url:") THEN
        '--- the url becomes the HREF (left side); the closing quote is added below
        leftside$ = REPLACE$(array$[i], "url:" , "<DT><A HREF=\"")
        'PRINT CHOP$(leftside$) & Q$ ,CHOP$(rightside$) & "</A>"
        leftside$ = CHOP$(leftside$)
        rightside$ = CHOP$(rightside$)
        b2$ = CONCAT$(leftside$ ,Q$ ,rightside$ ,"</A>")
        WRITELN b2$ TO myfile
    END IF
NEXT

CLOSE FILE myfile
PRINT "/tmp/Bookmarks.html created! "
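For readers who do not have BaCon at hand, the same split / filter / concat idea can be sketched in Python. This is only an illustration of the approach, not a port of blinks.bac: the function name json_fragments_to_links and the sample string are made up for this example, and like the original it assumes that each "name" fragment comes before its "url" fragment and that no value contains a comma.

```python
import re

def json_fragments_to_links(text):
    """Hypothetical helper: split on commas, strip JSON punctuation,
    then pair each "name" fragment with the "url" fragment that follows it."""
    links = []
    name = ""
    for frag in text.split(","):
        # Drop braces, brackets and double quotes, like the REPLACE$ calls do.
        frag = re.sub(r'[{}\[\]"]', "", frag).strip()
        if "name:" in frag:
            # Remember the link text until the matching url shows up.
            name = frag.replace("name:", "", 1).strip()
        if "url:" in frag:
            url = frag.replace("url:", "", 1).strip()
            links.append('<DT><A HREF="%s">%s</A>' % (url, name))
    return links

# Made-up sample in the shape of one bookmark entry.
sample = '{"name": "BaCon", "type": "url", "url": "https://example.org/bacon"}'
html_lines = json_fragments_to_links(sample)
print("\n".join(html_lines))
```

Splitting on commas keeps the code tiny, but it will cut a bookmark title or URL that itself contains a comma; a real JSON parser avoids that at the cost of a little more code.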
|
|
|
Post by Pjot on Mar 15, 2023 18:12:52 GMT 1
Thanks Joe,

I am happy to see the nice numbers in your time measurement.

And you're right, a million loops are ridiculous, but I have one more idea which I will look at. If this gets too complicated too, then I'll give up.

Regards
Peter
|
|
|
Post by bigbass on Mar 17, 2023 18:32:07 GMT 1
Hello Peter,

Well, I wanted to see a minimal way to concat a string, but using memory directly this time. Note: if you overload memory things blow up, so let's keep it simple. I changed your benchmark demo. My results show BaCon is still very fast, so there are no problems or any reason to worry.

Joe

OPTION PARSE FALSE

DECLARE str TYPE STRING
DECLARE i TYPE int

'--- 100000 characters plus one byte for the terminating NUL
str = (char *)malloc(sizeof(char) * (100000 + 1))
strcpy(str, "")
'--- FOR is inclusive in BaCon, so 1 TO 100000 appends exactly 100000 characters
FOR i = 1 TO 100000
    strcat(str, "a")
NEXT
'PRINT str FORMAT "%s\n"

I wasn't sure how to work that out in 100% BaCon, so I did it in C first and then ported it:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    /* 100000 characters plus one byte for the terminating NUL */
    char *str = (char *)malloc(sizeof(char) * (100000 + 1));
    strcpy(str, "");
    for (int i = 0; i < 100000; i++) {
        strcat(str, "a");
    }
    //printf("%s\n", str);
    free(str);
    return 0;
}
|
|
|
Post by bigbass on Mar 18, 2023 19:31:22 GMT 1
Hello Peter,

I had some time to play with this and thought to use online compilers, to be sure all the code demos will work for everyone with no need to set up and install anything; you might not want to have some of these extra dependencies.

Gotchas: be careful with the index when porting code. The C code I gave up on; maybe you could add that to the list. The memory part kept seg faulting on me in C.

Of course it is a lot slower testing online, but online compilers are a good tool to have while testing ideas.

Starting with BaCon:

DECLARE a TYPE STRING
DECLARE b TYPE STRING
DECLARE index1 TYPE int
DECLARE index2 TYPE int

a$ = "Hello"
FOR index1 = 1 TO 100000
    a$ = a$ & "benchmark"
NEXT
PRINT "length a: ", LEN(a$)

b$ = "Hello"
FOR index2 = 1 TO 100000
    b$ = "benchmark" & b$
NEXT
PRINT "length b: ",LEN(b$)

==============================================
JAVASCRIPT port (you just copy and paste it)
www.programiz.com/javascript/online-compiler/

var a = "Hello";
for (var index1 = 0; index1<100000; index1++) {
    a = a + "benchmark";
}
console.log("length a:", a.length);

var b = "Hello";
for (var index2 = 0; index2<100000; index2++) {
    b = "benchmark" + b;
}
console.log("length b:", b.length);

=====================================================
C++ port
www.programiz.com/cpp-programming/online-compiler/

#include <iostream>
#include <string>
#include <cstdlib>

int main() {

    std::string a = "Hello";
    for (int index1 = 0; index1 < 100000; index1++) {
        a = a + "benchmark";
    }
    std::cout << "length a: " << a.length() << std::endl;

    std::string b = "Hello";
    for (int index2 = 0; index2 < 100000; index2++) {
        b = "benchmark" + b;
    }
    std::cout << "length b: " << b.length() << std::endl;
}

=====================================================
GO port
www.programiz.com/golang/online-compiler/

package main

import (
    "fmt"
)

func main() {
    a := "Hello"
    for i := 0; i < 100000; i++ {
        a = a + "benchmark"
    }
    fmt.Println("length a: ", len(a))

    b := "Hello"
    for i := 0; i < 100000; i++ {
        b = "benchmark" + b
    }
    fmt.Println("length b: ", len(b))
}

=====================================================
Python port
www.programiz.com/python-programming/online-compiler/

a = "Hello"
for index1 in range(100000):
    a = a + "benchmark"
print("length a:", len(a))

b = "Hello"
for index2 in range(100000):
    b = "benchmark" + b
print("length b:", len(b))
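On an online compiler the shell's time command is usually not available, so the two patterns can also be timed from inside the program. A minimal sketch in Python, assuming nothing beyond the standard library; the function name time_concat and the loop count of 10,000 are choices made for this example only, and the timings will differ from machine to machine.

```python
import time

def time_concat(n, prepend):
    """Hypothetical helper: build the string n times and return (length, seconds)."""
    s = "Hello"
    start = time.perf_counter()
    for _ in range(n):
        if prepend:
            s = "benchmark" + s   # the "a$ = b$ + a$" pattern
        else:
            s = s + "benchmark"   # the "a$ = a$ + b$" pattern
    return len(s), time.perf_counter() - start

results = {}
for label, prepend in (("append", False), ("prepend", True)):
    results[label] = time_concat(10_000, prepend)
    length, seconds = results[label]
    print("%s: length %d in %.4f s" % (label, length, seconds))
```

Measuring both patterns in the same process makes it easy to see whether prepending is the slower of the two on a given interpreter, without the start-up time of the interpreter mixed into the number.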
|
|
|
Post by bigbass on Mar 21, 2023 5:34:49 GMT 1
Hello Peter,

The Raspberry Pi 3 is heavily stressed at one hundred thousand loops and gave me some bad results, such as seg faults; memory overload gave me poor results. I reduced the loops to ten thousand for all benchmarks, so that the test will work on low RAM systems (RPI3), and added one more language to the test: Rust. And there is some really good news now for BaCon!

I compiled and ran everything locally on the RPI3. Here are the new results, in order of winners.

benchmark4.tar.gz (1.09 KB)

BaCon
time ./bacon4
length a: 90005
length b: 90005
real 0m0.194s
user 0m0.183s
sys 0m0.011s
=====================
rustc rust4.rs
time ./rust4
length a: 90005
length b: 90005
real 0m0.312s
user 0m0.281s
sys 0m0.030s
==========================
time python ./python4.py
length a: 90005
length b: 90005
real 0m0.404s
user 0m0.326s
sys 0m0.079s
===================
time ./cplus4
length a: 90005
length b: 90005
real 0m0.560s
user 0m0.396s
sys 0m0.164s
====================
time node ./javascript4.js
length a: 90005
length b: 90005
real 0m0.625s
|
|
|
Post by Pjot on Mar 21, 2023 18:21:48 GMT 1
Hi Joe,

Thanks for your posts. I did read them, but I was too busy implementing a new algorithm for incremental string concatenation.

So let's start with some good news: the new algorithm now also is faster than NodeJS.

First of all, I noticed that BaCon 4.6.1 already was faster than NodeJS when concatenating the type "a$ = a$ + b$". So the problem was with the type "a$ = b$ + a$". And as mentioned before, interestingly, almost all languages are slower when they perform this last type of concatenation.

So, first I designed an algorithm which used a linked-list structure, splitting large strings into small pieces. Though the lab model performed very well, it soon became clear that this would also involve a complete redesign of BaCon, which would be just too much.

Then I invented a new algorithm which was compatible with the current data model for strings. In a nutshell, it has a RAM area on both sides of the string, letting the string 'float' somewhere in the middle, so to speak. So concatenations of the first type add text after the string, and concatenations of the second type add text before it.

This model of 'floating strings' came with some severe penalties and nasty side effects though. Particularly, there was one memory leak which literally took more than two full days to fix, one of the hardest problems I ever ran into.

Anyway, it is done now. These are the results when using a concatenation of 100.000 times:

NodeJS:
$ time nodejs ./bench3.js
length a: 900005
length b: 900005

real 0m0,062s
user 0m0,059s
sys 0m0,004s

BaCon 4.6.2:
$ time ./bench3
Length: 900005
Length: 900005

real 0m0,013s
user 0m0,013s
sys 0m0,000s

It's a bit hard to see, right? Let's change the value to 10.000.000:

NodeJS:
$ time nodejs ./bench3.js
length a: 90000005
length b: 90000005

real 0m1,569s
user 0m1,514s
sys 0m0,155s

BaCon 4.6.2:
$ time ./bench3
Length: 90000005
Length: 90000005

real 0m0,913s
user 0m0,813s
sys 0m0,100s

Note that the performance gets better when using optimization flags in GCC like '-O2'.

BTW, the current release BaCon 4.6.1 takes extremely long when using 10.000.000 as a value (I had to <ctrl>+<c> myself out of the loop). But the new algorithm can handle 10.000.000 very well.

Now a full run of all tests, with the value back to 100.000:

BaCon: 0.01
Javascript: 0.05 <--- NodeJS
Perl: 0.62
FreePASCAL: 1.23
FreeBASIC: 6.86
Python: 7.39
AWK: 7.59
newLisp: 7.83
TCL: 10.75
PHP: 11.37
Kornshell: 17.86
ZSH: 98.38

In the upcoming days, I will apply more tweaks and code improvements to straighten out some wrinkles here and there. And I hope you'll not come up with some other interpreter even faster than this. But if you find one, do not hesitate to let me know.

Best regards
Peter

PS: the latest code can be found in the fossil repo.
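The 'floating string' idea described above can be illustrated with a toy model in Python. This is a sketch under stated assumptions and not BaCon's actual implementation: the class name FloatingString, the slack parameter and the regrow policy (re-centre the text with slack proportional to its length) are all invented for this example.

```python
class FloatingString:
    """Toy model: the text sits in the middle of a larger buffer,
    with free room (slack) on both sides."""

    def __init__(self, text=b"", slack=16):
        self.buf = bytearray(slack + len(text) + slack)
        self.start = slack                 # index of the first text byte
        self.end = slack + len(text)       # index just past the last text byte
        self.buf[self.start:self.end] = text

    def _regrow(self, extra):
        # Out of room on one side: allocate a bigger buffer and re-centre the text.
        length = self.end - self.start
        slack = 2 * (length + extra)
        new = bytearray(slack + length + slack)
        new[slack:slack + length] = self.buf[self.start:self.end]
        self.buf, self.start, self.end = new, slack, slack + length

    def append(self, piece):
        # "a$ = a$ + b$": write into the slack after the text.
        if self.end + len(piece) > len(self.buf):
            self._regrow(len(piece))
        self.buf[self.end:self.end + len(piece)] = piece
        self.end += len(piece)

    def prepend(self, piece):
        # "a$ = b$ + a$": write into the slack before the text.
        if self.start < len(piece):
            self._regrow(len(piece))
        self.buf[self.start - len(piece):self.start] = piece
        self.start -= len(piece)

    def value(self):
        return bytes(self.buf[self.start:self.end])


s = FloatingString(b"Hello")
for _ in range(1000):
    s.append(b"benchmark")
    s.prepend(b"benchmark")
print("length:", len(s.value()))
```

As long as there is slack left on the relevant side, neither operation has to move the existing text; only the occasional regrow copies it, which is what makes prepending as cheap as appending in this model.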
|
|
|
Post by bigbass on Mar 22, 2023 1:53:22 GMT 1
Hello Peter
You did it. It's blazing fast! Yes, it was hard work, but in the end you got the best time on the 1/4 mile track.
It is almost instantaneous: just click and see the results.
Very thankful,
and glad you gave it another try in the C part of the code (for me ... I crashed and burned in C), but on a positive note I had fun porting to some other languages and running tests.
Joe
|
|
|
Post by Pjot on Mar 22, 2023 7:12:12 GMT 1
Thanks Joe,

Good that you mention your ports. I took a look at your programs for C++, Go and Rust. Curiously, they all have problems with the "a$ = b$ + a$" type of concatenation.

By the way, the "a$ = b$ + a$" is just a generic model for incremental concatenations. It also means "a$ = b$ + c$ + a$", "a$ = b$ + a$ + c$" and so on, as long as the assigned variable does not occur as the first term in the assignment.

I have installed all those compilers on my system and ran the test again, after setting the loop in all programs to 100.000 iterations:

BaCon: 0.01
Javascript: 0.06
Perl: 0.58
FreePASCAL: 1.20
FreeBASIC: 6.79
Rust: 7.47
Python: 7.50
newLisp: 7.75
AWK: 7.93
TCL: 10.60
C++: 10.99
PHP: 13.17
Go: 17.03
Kornshell: 17.95
ZSH: 94.65

So the only real competitor seems to be Javascript. Attached is the full package, including your test programs.

BR
Peter

Attachments: benchmark.tar.gz (2.46 KB)
|
|
|
Post by bigbass on Mar 25, 2023 17:17:06 GMT 1
Hello Peter,

I went back to take another go at it in C code (even though I know you win with the BaCon code), but it's a personal thing: I need to have a stand-alone C code demo to complete the test. Giving up is not an option. I just look for a solution to fix the problem, then give it another try at some later time, when I am ready to take it on again and get out of the mental fog.

There is still a mysterious problem with memory: if I increase the loop to 100,000 it will seg fault on the RPI3 ("the original problem"). All that said, here is at least a stand-alone working C code example with malloc that does work correctly up to 10,000 loops. The time is a little faster than C++, and no macros or hacks were added.

This is as far as I can get with it for today, and you can see why I stopped the first time with seg faults. I can't put my finger on why it blows up at one hundred thousand loops. It seems like it can't allocate memory for it, or the pointer is limited in size? But this code does work on the RPI3!

P.S. Thanks for your new benchmarks .tar.gz

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    /* append: grow the buffer in place, then add the piece at the end */
    char* a = (char*)malloc(sizeof(char)*6);
    strcpy(a, "Hello");
    for (int index1 = 0; index1 < 10000; index1++) {
        a = (char*)realloc(a, strlen(a) + strlen("benchmark") + 1);
        strcat(a, "benchmark");
    }
    /* note: strlen returns size_t, for which %zu is the portable format */
    printf("length a: %d\n", strlen(a));

    /* prepend: build "benchmark" + b in a new buffer, then swap */
    char* b = (char*)malloc(sizeof(char)*6);
    strcpy(b, "Hello");
    for (int index2 = 0; index2 < 10000; index2++) {
        char* temp = (char*)malloc(sizeof(char)*(strlen(b) + strlen("benchmark") + 1));
        strcpy(temp, "benchmark");
        strcat(temp, b);
        free(b);
        b = temp;
    }
    printf("length b: %d\n", strlen(b));

    free(a);
    free(b);

    return 0;
}
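As a rough piece of arithmetic on the prepend loop above: every pass copies the piece and then the whole old string into a fresh buffer, so the number of characters moved grows with the square of the loop count. The sketch below only counts those characters; it ignores the terminating NUL, the repeated strlen scans and whatever the allocator does, and the function name chars_copied_by_prepend is made up for this example. It says nothing about the cause of the seg fault.

```python
def chars_copied_by_prepend(iterations, start_len=5, piece_len=9):
    """Hypothetical helper: count characters written by the
    strcpy(temp, piece) + strcat(temp, b) pair over all iterations."""
    total = 0
    length = start_len          # current length of b ("Hello" = 5)
    for _ in range(iterations):
        total += piece_len      # strcpy(temp, "benchmark")
        total += length         # strcat(temp, b) copies the whole old string
        length += piece_len
    return total, length

total, final_length = chars_copied_by_prepend(10_000)
print("final length:", final_length)
print("characters copied:", total)
```

For 10,000 iterations this comes to about 450 million characters copied to build a 90,005 character string, which is one plausible reason why the plain malloc version is not much faster than the other ports.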
|
|
|
Post by Pjot on Mar 26, 2023 8:15:38 GMT 1
Thanks Joe,
Looks good to me!
BTW I noticed that compiling BaCon on the "Raspberry Pi 4 Model B Rev 1.5" gets flaky when using the current Raspbian GCC 10.2.1.
But it will work when using "clang":
CXXFLAGS="-x c++" CC=clang CXX=clang++ ./configure --enable-gui-tk
After this, compilation and usage of BaCon works fine on my RPI 4.
BR Peter
|
|
|
Post by Pjot on Mar 26, 2023 18:15:28 GMT 1
Added Java, Lua and Ruby. The languages S-Lang, ScriptBasic, ZSH, BASH and M4 are way too slow. I have left them out of the list.

Also, for fun, I tried a loop iteration of 100.000.000. With this number, BaCon takes approx. 8.7 seconds on my system to complete the concatenations, but NodeJS crashes. Other languages are too slow for that amount (or they crash also).

So these are the latest results (all set to 100.000 iterations):

BaCon: 0.01
Javascript: 0.06
Perl: 0.63
FreePASCAL: 1.20
Go: 4.83
FreeBASIC: 6.79
Python: 7.40
AWK: 7.44
Rust: 7.47
newLisp: 7.84
Ruby: 7.84
Java: 10.49
Lua: 10.68
TCL: 10.69
C++: 11.22
PHP: 11.63
Kornshell: 17.78

Attached is the package with all the code samples, including those I have left out of the above list.

BR
Peter

Attachments: benchmark.tar.gz (2.98 KB)
|
|
|
Post by bigbass on Mar 28, 2023 6:44:10 GMT 1
Hello Peter
Well, I solved the problem: do not use the 32 bit OS on the RPI3. I used Manjaro 64 bit for the RPI3 (xfce) and now the code works. So the root cause is that 32 bit memory allocation runs out and then seg faults, but on a 64 bit OS all is fine, though still slow.

The C code demo is good; only a minor adjustment is needed for 64 bit in the print statements.

FROM 32 bit

printf("length a: %d\n", strlen(a));
printf("length b: %d\n", strlen(b));

TO 64 BIT

printf("length a: %ld\n", strlen(a));
printf("length b: %ld\n", strlen(b));
P.S I use clang++ because the error messages are easier to debug (easier to understand)
Thanks for .
Having switched over to Manjaro, I still have to install a few compilers on it before I can run all the benchmarks again, but I am working on it.
Joe
|
|
|
Post by bigbass on Mar 28, 2023 18:37:36 GMT 1
Hello Peter,

Just like in boxing, we have different classes: a heavyweight can't fight against a featherweight (well, they can, but we know who wins). The same rule applies to the Raspberry Pi 3 compared to a 64 bit OS with more RAM, or some other board such as the RPI4 with 8 GB of RAM.

The way to make the test more rounded is to see what happens on a low RAM system, doing the loops with 10,000 iterations, all on the same system, with no optimizing options added to the compiler.

Special note: this is just one test; it is in no way complete. I was surprised by the results. I must be using a newer version of Rust on Manjaro compared to Debian Bullseye, because now Rust is faster than I would have expected it to be. Note that I would still pick BaCon over Rust for coding anything, any day, and if I was doing anything with just a focus on web pages I would use JavaScript.

Mileage may vary, but I attached the test files.

benchmark4.tar.bz2 (1.14 KB)

rustc -V
rustc 1.67.1 (d5a82bbd2 2023-02-07) (Arch Linux rust 1:1.67.1-1)

RAM is a big factor in the benchmark results.

rustc rust4.rs
time ./rust4
length a: 90005
length b: 90005
real 0m0.250s
user 0m0.163s
sys 0m0.078s
--------------------
time ./bacon4
length a: 90005
length b: 90005
real 0m0.033s
user 0m0.024s
sys 0m0.009s
----------------------------
time ./cplus4
length a: 90005
length b: 90005
real 0m0.624s
user 0m0.436s
sys 0m0.174s
---------------------------
time python3 python4.py
length a: 90005
length b: 90005
real 0m0.672s
user 0m0.362s
sys 0m0.127s
--------------------
time ./c_malloc
length a: 90005
length b: 90005
real 0m1.112s
user 0m1.038s
sys 0m0.068s
--------------------
go build go4.go
time ./go4
length a: 90005
length b: 90005
real 0m2.128s
user 0m2.421s
sys 0m0.597s
------------------------
time node ./javascript4.js
length a: 90005
length b: 90005
real 0m3.286s
user 0m1.073s
sys 0m0.323s
--------------------
|
|
|
Post by Pjot on Mar 28, 2023 19:30:23 GMT 1
Thanks Joe,
That's interesting, now NodeJS is the slowest of all. Weird?
I agree these benchmark programs are not exact science, results may vary per architecture and operating system and hardware.
Also, an interpreter works differently compared to a compiler, which, in its turn, differs from a source code converter like BaCon. So what if BaCon would have generated Go code, or Rust code? On the other hand, we can ask: why are the native string processing functions in other languages so slow?
Next to that, I did receive remarks like "concatenation of strings can be faster in this language when you do it such-and-so", but then we can do the same thing in BaCon just as well.
Nevertheless, the whole comparison exercise should give an impression, and it's interesting to compare the results IMHO.
BR Peter
|
|
|
Post by alexfish on Mar 28, 2023 20:09:39 GMT 1
Hi All
GO looks ok C++ ?
BR Alex
|
|