|
Post by alexfish on Jan 17, 2015 14:58:29 GMT 1
Hi Vovchik
I think you mixed up the old and new methods, which are not the same, although I am not sure about the iterations to file length?
Although I do know BaCon can concatenate more pieces in one call via the variadic arguments, that is not a fair test for a straight line-by-line iteration over a file.
So, for the above results, I did this:
OPTION PARSE FALSE
PRAGMA INCLUDE sds.h sds.c
DECLARE newstring$ TYPE STRING

' ------------------
FUNCTION CAT$(STRING FILENAME$)
' ------------------
    LOCAL mystring TYPE sds
    mystring = sdsnew("")
    st$ = ""
    LOCAL fileline$, txt$ TYPE STRING
    IF FILEEXISTS(FILENAME$) THEN
        OPEN FILENAME$ FOR READING AS catfile
        WHILE NOT(ENDFILE(catfile)) DO
            READLN fileline$ FROM catfile
            REM txt$ = SDSCAT$(txt$, fileline$, NL$)
            mystring = sdscat(mystring, fileline$)
            mystring = sdscat(mystring, NL$)
        WEND
        CLOSE FILE catfile
    END IF
    st$ = (char*)mystring
    sdsclear(mystring)
    RETURN st$
END FUNCTION

FOR i = 1 TO 2000
    x$ = CAT$("testhelp.h")
NEXT i
prog$ = (char*)argv[0]
PRINT prog$, " - done with ", i - 1, " iterations"
PRINT "Filelen: ", LEN(x$)
The old version:
' ------------------
FUNCTION CAT$(STRING FILENAME$)
' ------------------
    LOCAL fileline$, txt$ TYPE STRING
    IF FILEEXISTS(FILENAME$) THEN
        OPEN FILENAME$ FOR READING AS catfile
        WHILE NOT(ENDFILE(catfile)) DO
            READLN fileline$ FROM catfile
            txt$ = CONCAT$(txt$, fileline$, NL$)
        WEND
        CLOSE FILE catfile
    END IF
    RETURN CHOP$(txt$)
END FUNCTION

FOR i = 1 TO 2000
    x$ = CAT$("testhelp.h")
NEXT i
prog$ = (char*)argv[0]
PRINT prog$, " done with ", i - 1, " iterations"
PRINT "Filelen: ", LEN(x$)
Have Fun + BR Alex
Added: I did a similar test with lib Eina, but I am not sure if it is totally fair. I used the BaCon TIMER, because Eina sets up its own memory pool in the init sequence and cleans up its mess on shutdown, so that is where the timer was used.
I.e.:
eina_init()
ST = TIMER
FOR i = 1 TO 2000
    x$ = CAT$("testhelp.h")
NEXT i
prog$ = (char*)argv[0]
PRINT prog$, " - done with ", i - 1, " iterations"
PRINT "Filelen: ", LEN(x$)
eina_strbuf_free(buf)
PRINT " internal Timer :", TIMER - ST
eina_shutdown()
Result: I also noticed that SDS and Eina use the same buffer size, so the methods look about the same.
Also, to be fair, I think BaCon's CONCAT is not bad in terms of speed given its method, CONCAT(myconcat,__VA_ARGS__),
but adding something like SDS can give it the edge when concatenating a file line by line.
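
A minimal sketch in C of why line-by-line appending benefits from a length-tracking buffer. SDS keeps the string length in a header next to the data, so an append does not have to rescan the existing text. The `linebuf` struct and its functions below are hypothetical stand-ins written for illustration; they are not the SDS or Eina data layout or API, and the initial capacity of 64 bytes is an arbitrary assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-tracking buffer (illustration only, not the SDS layout). */
typedef struct {
    size_t len;   /* bytes in use, excluding the terminating NUL */
    size_t cap;   /* bytes allocated for data                    */
    char  *data;  /* NUL-terminated once allocated               */
} linebuf;

/* Append src to b. Returns 0 on success, -1 if memory runs out
   (in which case b is left unchanged). */
int linebuf_append(linebuf *b, const char *src)
{
    size_t n = strlen(src);               /* only the new piece is measured */

    if (b->len + n + 1 > b->cap) {
        size_t newcap = b->cap ? b->cap : 64;   /* assumed starting size */
        while (newcap < b->len + n + 1)
            newcap *= 2;                  /* grow geometrically */
        char *p = realloc(b->data, newcap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = newcap;
    }
    /* The end of the string is known from len, so no scan is needed. */
    memcpy(b->data + b->len, src, n + 1); /* n + 1 copies the NUL as well */
    b->len += n;
    return 0;
}

/* Release the buffer and reset it to the empty state. */
void linebuf_free(linebuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

A file reader would start from `linebuf b = {0, 0, NULL};`, call `linebuf_append` once per line and once per newline, and call `linebuf_free` when done. Because the capacity doubles, most appends need no reallocation at all.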
|
|
|
Post by vovchik on Jan 17, 2015 15:42:25 GMT 1
Dear Alex, Yep - and I think you picked up the archive before I made some changes - which are like yours, I think. So the conclusion is that we get a substantial increase in speed, which means that we should probably integrate the sdscat routine somehow. I am certain Peter will have an excellent idea regarding implementation. With kind regards, vovchik
|
|
|
Post by vovchik on Jan 18, 2015 11:27:48 GMT 1
Dear Peter and Alex, Here are some more tests. You will see that SDS performs the job in 0m0.003s, whereas standard BaCon's best time is 0m1.055s on my machine. That is a huge difference. I am very impressed with SDS concat, to say the least. With kind regards, vovchik Attachments:more_cat.tar.gz (9.76 KB)
|
|
|
Post by Deleted on Jan 18, 2015 20:44:41 GMT 1
Can SDS be considered a standalone string engine and garbage collector?
|
|
|
Post by Pjot on Jan 19, 2015 14:15:20 GMT 1
Thanks vovchik and Alex,
To me it is also clear that we have to go for the SDS implementation. I was thinking of using the existing 'STRING' declaration type to declare a string as 'sds'-aware.
Not sure what to do with the string functions - I guess a prefix like 'PCONCAT$', 'PLEFT$', 'PRIGHT$' etc. would work?
Alternatively, another approach would be using an option like 'OPTION POWERSTRING' or something like that, which will switch the existing functions to sds-aware functions.
Or: use an additional suffix on variable names to indicate the sds type. So '$' for string and '@' or so for sds. (Not everybody is in favor of this...)
For SDS itself, it can be added as an unmodified objectfile to the 'libbacon.a' library created at installation time. This way its license will not be broken.
@ jrs: sds does not contain a garbage collector, but it has its own 'sdsfree' call which has to be used at the appropriate places in the C source code. This is something BaCon will do for you (as it does now with regular strings).
BR Peter
|
|
|
Post by vovchik on Jan 19, 2015 14:46:01 GMT 1
Dear Peter, I also think SDS is great. It could be "filled out", so to speak, by a few extra C functions, which I have in the attached archive, just as an example. The OPTIONS option is appealing and the P prefix is OK, too. I am glad you like the lib and am certain it will make string handling in BaCon exceptionally fast. With kind regards, vovchik Attachments: tests_c.tar.gz (2.54 KB)
|
|
|
Post by dave99 on Jan 19, 2015 18:47:19 GMT 1
Hi all
I have been following this thread with some interest and herewith my 2 cents worth:
How about adding "SDS_" to the beginning of the string? For Example:
SDS_TheStringName
I'm also not a fan of adding "@" to strings. Whichever way you choose, Peter, please don't break compatibility with older versions of BaCon.
Regards. Dave.
|
|
|
Post by dave99 on Jan 19, 2015 18:49:25 GMT 1
Sorry forgot to ask, if you implement SDS, will it also be available in the bacon.bash version?
|
|
|
Post by Pjot on Jan 20, 2015 7:45:33 GMT 1
Hi dave99,
Of course BaCon should be backwards compatible. Also, the SDS functionality will be available in all versions including the shell implementation of BaCon.
BR Peter
|
|
|
Post by vovchik on Jan 20, 2015 10:18:57 GMT 1
Dear dave99,
We old BaCon hands normally use the shell version to "bootstrap" a new compiled bacon, but why would you really want to compile anything but binary bacon using bacon.sh? (I know you hacked bacon.sh to find io.h in Proteus and it works.) But if you can compile something (anything) using bacon.sh, it should mean that you have a working gcc and can compile bacon.bac using bacon.sh. And the compiling speeds between bacon.sh and bacon are staggeringly different:

bacon.sh test.bac
Converting 'test.bac'... done, 102 lines were processed in 9 seconds.
Compiling 'test.bac'... cc -c test.bac.c
cc -o test test.bac.o -lbacon -lm -ldl
Done, program 'test' ready.

real 0m9.290s
user 0m7.256s
sys 0m0.556s

bacon test.bac
Converting 'test.bac'... done, 102 lines were processed in 0.079 seconds.
Compiling 'test.bac'... cc -c test.bac.c
cc -o test test.bac.o -lbacon -lm -ldl
Done, program 'test' ready.

real 0m0.619s
user 0m0.504s
sys 0m0.068s

This is for a 100-line string loop test. Compiled bacon does it in a little over half a second on my slow machine, and the shell version takes nearly 10 seconds (roughly 15x faster using compiled bacon, going by the 'real' times). The binaries are the same, so you only incur a time penalty using bacon.sh for anything but compiling bacon.bac. In any case, it is great that you solved the problem you encountered with Proteus.
With kind regards, vovchik
|
|
|
Post by dave99 on Jan 21, 2015 9:47:54 GMT 1
Peter,
Thank you.
Vovchik,
I agree with you that using the bash version is very, very slow, but I had problems compiling BaCon, so for now I just reverted to using the bash one. Towards the end of the week I should have more time and will tackle compiling it again.
Regards. Dave.
|
|
|
Post by alexfish on Jan 22, 2015 13:32:23 GMT 1
Hi Peter
Looking at BaCon's concatenation methods, could something like the below work? Here String = the power string.

String my_str = "opop"

StringCat " more data" TO my_str
StringCat " And even more data" TO my_str

StringCat " more data" & " And even more data" TO my_str

Or

String my_str = "opop"
StringCat my_str " more data"
StringCat my_str " more data" & " And even more data"
Alex
|
|
|
Post by Deleted on Jan 25, 2015 6:26:30 GMT 1
Before I stray too far from the farm, I want to give Peter Verhas's MyAlloc memory manager a try. Being thread safe and mature and a key component in Script BASIC, makes me want to dig a little deeper.
I'm still very interested in what you guys are doing with sds and BaCon.
|
|
|
Post by Pjot on Jan 25, 2015 11:16:06 GMT 1
Hi vovchik,
Thanks for your previous demonstration programs. The C code is quite clear. But the main problem with your functions is that they suffer from major memory leaks: the string functions perform a 'malloc' to store a result, but this memory never gets freed.
The current BaCon string implementation provides the following features:
- No memory leaks when returning string results or when used in SUB or FUNCTION
- Allows string functions called within string functions (nesting) without memory leaks
- Allows variadic argument lists (e.g. CONCAT or '&') accepting both char* and plain "text" as input
- No dependency on external garbage collectors or memory managers so all works on exotic platforms
- Plain POSIX compliant C implementation and no architecture specific assembly code
The way this is done is by using an internal static array of strings and a separate array which keeps track of the allocated memory for each array member. String results from string functions are stored in this internal array, after which the next member is used to store the next result. Plain BaCon string variables copy their contents from this array.

Though the requirements provide a very compatible and versatile C string implementation, this comes with a cost in terms of performance. Because of all the looping through the internal result array, allocating memory for elements where necessary (only growing, not diminishing), and also because of the copying to normal string variables, the performance goes down. Still, it is not bad (it is the best of any BASIC I have tested), but it is not as good as we would like to see.

So over the last weeks I have been studying the Better String Library and the Simple Dynamic Strings implementation to see if there is a way to use these libraries within BaCon. I actually have tried to use them as a replacement for the current plain C strings offered by BaCon. This, however, turns out to be impossible, unless we give up one of the requirements mentioned above. Especially a function like 'CONCAT', which should accept a variadic number of arguments in both char* and plain "text" format, is impossible to implement using 'bcatcstr' or 'sdscat'. (Note that BaCon uses a plain 'memcpy' to concatenate strings, not 'strcat'.)

Looking closer, it will be clear that a function like 'sdscat' from SDS can only be used in certain circumstances, namely, only when an SDS type of string should have a plain "text" added to it (similar for Better String Lib). But it is impossible to see whether an incoming string from a variadic list of arguments is a plain "text", a char* or an SDS string. We have to know in advance what type of arguments will be sent to a concatenation in order to use 'sdscat'. But in real life we do not know.
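
The internal result array described above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not BaCon's actual source: the names `result_slot`, `result_cap`, `result_next` and `concat_slots`, the slot count of 8, and the NULL-terminated argument convention are all hypothetical. It shows rotating result buffers, a separate grow-only capacity array, and a variadic concatenation that copies with `memcpy`.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

#define RESULT_SLOTS 8   /* assumed size, chosen only for this sketch */

static char  *result_slot[RESULT_SLOTS];  /* rotating buffers for results  */
static size_t result_cap[RESULT_SLOTS];   /* bytes allocated for each slot */
static int    result_next;                /* slot used by the next result  */

/* Concatenate a NULL-terminated list of C strings into the next slot.
   The returned pointer is owned by the slot array and must not be freed.
   Returns NULL if memory runs out. */
char *concat_slots(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t total = 0;

    /* Pass 1: measure every argument. Each one is treated as a plain
       NUL-terminated char*; the function cannot tell where it came from. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);

    int slot = result_next;
    result_next = (result_next + 1) % RESULT_SLOTS;

    /* Slots only ever grow, so a later, shorter result reuses the memory. */
    if (total + 1 > result_cap[slot]) {
        char *p = realloc(result_slot[slot], total + 1);
        if (p == NULL)
            return NULL;
        result_slot[slot] = p;
        result_cap[slot] = total + 1;
    }

    /* Pass 2: copy the pieces one after another with memcpy. */
    char *out = result_slot[slot];
    size_t off = 0;
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *)) {
        size_t n = strlen(s);
        memcpy(out + off, s, n);
        off += n;
    }
    va_end(ap);
    out[off] = '\0';
    return out;
}
```

Because each call writes into a fresh slot, a nested call such as `concat_slots(concat_slots("a", "b", (char *)NULL), "c", (char *)NULL)` works without the caller freeing anything. The limitation of this sketch is that a result is overwritten once `RESULT_SLOTS` further calls have been made, so anything that must live longer has to be copied into a variable of its own.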
Therefore, the only way would be to use any of these libraries outside BaCon, as a wrapper or interface, in specific situations for which specific functions can be used. A simple concatenation of a character to a string works out of the box with 'sdscat'. But this is a very specific situation. What if we would like to add 5 or 10 strings of mixed types on the same line? It can be done in BaCon, but this cannot be done the same way by the aforementioned libraries: we have to write additional looping code to make it work.

So I guess we end up designing a wrapper for SDS; this is the best which can be done right now if we want to maintain the string features of BaCon as they are now.
BR Peter
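
The "additional looping code" mentioned above might look roughly like the sketch below. To stay self-contained it does not use SDS itself: `dyn_cat` is a hypothetical stand-in that only mimics the calling shape of `sdscat(s, t)` (take the current string, return the possibly moved result), and `dyn_cat_many` is an assumed wrapper name. The argument list is assumed to be terminated by a NULL pointer, and every argument is assumed to be a plain NUL-terminated C string.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a single-append primitive shaped like sdscat(s, t).
   s must be NULL or a pointer previously returned by this function.
   Returns the new string, or NULL on failure (s then stays valid). */
char *dyn_cat(char *s, const char *t)
{
    size_t slen = s ? strlen(s) : 0;
    size_t tlen = strlen(t);
    char *p = realloc(s, slen + tlen + 1);
    if (p == NULL)
        return NULL;
    memcpy(p + slen, t, tlen + 1);   /* tlen + 1 copies the NUL as well */
    return p;
}

/* Wrapper: append a NULL-terminated list of C strings to s, one
   single-append call per argument. On failure s is freed and NULL
   is returned, so the caller never holds a stale pointer. */
char *dyn_cat_many(char *s, ...)
{
    va_list ap;
    const char *t;

    va_start(ap, s);
    while ((t = va_arg(ap, char *)) != NULL) {
        char *p = dyn_cat(s, t);
        if (p == NULL) {
            free(s);
            s = NULL;
            break;
        }
        s = p;
    }
    va_end(ap);
    return s;
}
```

With such a wrapper, a call like `s = dyn_cat_many(s, " more data", " And even more data", (char *)NULL);` appends several pieces in one statement, and the caller releases the result with `free(s)`. The wrapper still relies on the caller knowing that the first argument is the dynamic string and the rest are plain C strings, which is the restriction the post describes.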
|
|
|
Post by vovchik on Jan 25, 2015 13:00:11 GMT 1
Dear Peter, I just stuck in those C bits as a departure point for further SDSification. Yep, massive memory leaks, but they are just there as food for thought for implementing analogous things in SDS, nearly parallel to the normal BaCon string functions, which are great. I have a few more, but was just thinking out loud there. You are doubtless right that SDSing all strings would have some undesired effects, so a special SDS lib and analogous functions with an SDS or some other prefix might be the way to go. I am certain your solution will be the wisest and most efficient in the end... With kind regards, vovchik
|
|