|
Post by alexfish on Jul 8, 2017 15:04:53 GMT 1
Hi Vovchik

can test this

BR Alex

' Include the CURL context
INCLUDE "curl.bac"

'https://www.basic-converter.org/

' We store our data in this variable
DECLARE mail$
DECLARE MAIL$=""

' Callback to let Curl gather data
FUNCTION Save_Data(char* buffer, size_t size, size_t nmemb, void *userp)
    mail$ = CHOP$(buffer)
    MAIL$ = MAIL$ & mail$
    'mail$ = LEFT$(mail$,LEN(mail$)-1)
    RETURN size*nmemb
END FUNCTION

' Clean end of html
SUB Clean()
    IF RIGHT$(MAIL$,1) = "0" THEN
        MAIL$ = LEFT$(MAIL$,LEN(MAIL$)-1)
    END IF
END SUB

SUB Search()
    ' Create the handle
    curl_global_init(CURL_GLOBAL_ALL)
    easyhandle = curl_easy_init()

    ' Set the options
    curl_easy_setopt(easyhandle, CURLOPT_URL, "https://www.basic-converter.org/")
    curl_easy_setopt(easyhandle, CURLOPT_SSL_VERIFYHOST, 0L);
    curl_easy_setopt(easyhandle, CURLOPT_WRITEFUNCTION, Save_Data)

    ' Perform the GET
    success = curl_easy_perform(easyhandle)

    ' Cleanup
    curl_easy_cleanup(easyhandle)
    CALL Clean
END SUB

CALL Search()
PRINT MAIL$
One should really check and examine first, then set the options; in particular, look at curl.haxx.se/libcurl/c/https.html
|
|
|
Post by vovchik on Jul 8, 2017 17:08:25 GMT 1
Dear Alex,
This bit:
curl_easy_setopt(easyhandle, CURLOPT_SSL_VERIFYHOST, 0L)
helped (good work), and some of the URLs are now perfectly normal. Others still render as file:///. There must be an additional parameter to fix those automatically via curl calls, I should think.
With kind regards, vovchik
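
One possible reason for links that still show up as file:/// (an assumption, not confirmed in this thread) is that the fetched page contains relative hrefs, which a viewer then resolves against the local copy instead of the original site. A minimal sketch of resolving such an href against the URL the page was fetched from is shown below. `resolve_href` is a hypothetical helper, not part of libcurl or BaCon, and it deliberately ignores "../" segments, fragments and non-http schemes such as mailto:.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libcurl or BaCon): resolve an href found
 * in a page against the URL the page was fetched from.
 * Returns 0 on success, -1 if base is not absolute or out is too small. */
int resolve_href(const char *base, const char *href, char *out, size_t outsz)
{
    const char *colon = strstr(href, "://");
    const char *scheme_end = strstr(base, "://");
    const char *host_end, *path_end, *last_slash, *p;
    int n;

    /* Absolute href: "scheme://" appears before any '/', '?' or '#'. */
    if (colon != NULL && colon > href &&
        strcspn(href, "/?#") == (size_t)(colon - href) + 1) {
        n = snprintf(out, outsz, "%s", href);
    } else if (scheme_end == NULL) {
        return -1;                          /* base itself is not absolute */
    } else if (strncmp(href, "//", 2) == 0) {
        /* Protocol-relative: reuse only the scheme of the base. */
        n = snprintf(out, outsz, "%.*s:%s", (int)(scheme_end - base), base, href);
    } else {
        /* host_end points just past "scheme://host". */
        host_end = scheme_end + 3 + strcspn(scheme_end + 3, "/?#");
        if (href[0] == '/') {
            /* Root-relative: keep scheme and host, replace the path. */
            n = snprintf(out, outsz, "%.*s%s", (int)(host_end - base), base, href);
        } else {
            /* Relative to the directory of the base path (query ignored). */
            path_end = host_end + strcspn(host_end, "?#");
            last_slash = NULL;
            for (p = host_end; p < path_end; p++)
                if (*p == '/')
                    last_slash = p;
            if (last_slash == NULL)
                n = snprintf(out, outsz, "%.*s/%s", (int)(host_end - base), base, href);
            else
                n = snprintf(out, outsz, "%.*s%s", (int)(last_slash - base + 1), base, href);
        }
    }
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Called with the base "https://duckduckgo.com/html/?q=BaCon" and the href "/about", the helper writes "https://duckduckgo.com/about" into the output buffer; an href that is already absolute is copied unchanged.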
|
|
|
Post by alexfish on Jul 8, 2017 17:32:15 GMT 1
Hi Vovchik

Still looking, but also testing the links using HUG. If a default browser is set, then this works: try clicking the label, hovering over the label, or right-clicking on the label.

INCLUDE "hug.bac"
win = WINDOW( "Link",300,200)
link1 = MARK("<a href='http://www.basic-converter.org/'>BaCon</a>",-1,-1)
ATTACH(win,link1,5,5)
DISPLAY

Actual code in archive.

BR Alex

Attachments: Links.bac.bz2 (181 B)
|
|
|
Post by alexfish on Jul 8, 2017 18:14:22 GMT 1
|
|
|
Post by alexfish on Jul 8, 2017 18:36:06 GMT 1
Hi Vovchik
also appended "\n" to the concat for now
' Include the CURL context
INCLUDE "curl.bac"

'https://www.basic-converter.org/

' We store our data in this variable
DECLARE mail$
DECLARE MAIL$=""

' Callback to let Curl gather data
FUNCTION Save_Data(char* buffer, size_t size, size_t nmemb, void *userp)
    mail$ = CHOP$(buffer)
    MAIL$ = MAIL$ & mail$ & "\n"
    'mail$ = LEFT$(mail$,LEN(mail$)-1)
    RETURN size*nmemb
END FUNCTION

' Clean end of html
SUB Clean()
    IF RIGHT$(MAIL$,2) = "0\n" THEN
        MAIL$ = LEFT$(MAIL$,LEN(MAIL$)-2)
    END IF
END SUB

SUB Search()
    ' Create the handle
    curl_global_init(CURL_GLOBAL_ALL)
    easyhandle = curl_easy_init()

    ' Set the options
    ' "https://duckduckgo.com/html/?q=BaconConverter"
    ' "https://www.basic-converter.org/"
    curl_easy_setopt(easyhandle, CURLOPT_URL, "https://duckduckgo.com/html/?q=BaconBasicConverter")
    curl_easy_setopt(easyhandle, CURLOPT_SSL_VERIFYHOST, 0L);
    curl_easy_setopt(easyhandle, CURLOPT_WRITEFUNCTION, Save_Data)

    ' Perform the GET
    success = curl_easy_perform(easyhandle)

    ' Cleanup
    curl_easy_cleanup(easyhandle)
    CALL Clean
END SUB

CALL Search()
PRINT MAIL$
|
|
|
Post by alexfish on Jul 8, 2017 19:58:54 GMT 1
Hi Vovchik

Just found out, after a bit of tweaking, that a web page can be made from ONE label.

Hints:
use gtk_label_set_single_line_mode(label,0)
set the xalign to 0
enable copying text with gtk_label_set_selectable set to true
the automatic connect can be stopped with CALLBACK(widget,"activate-link",My_function,...)

Example:

SUB My_function(long id , STRING url$)
    PRINT " Link Url >" << url$
END SUB

The label can be composed like so:

"<a href='http://www.basic-converter.org/'>BaCon basic Converter</a> \nRest Of text goes here with all the triming here\n and further Bits of Blagh Blagh\n <a href='http://www.basic-converter.org/'>BaCon AT</a>"

BR Alex

Demo mock-up: one window, one label.
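
When the label text is assembled from downloaded search results, any `&`, `<`, `>` or quote character inside a URL or title has to be escaped, otherwise the markup string is no longer valid. Below is a minimal sketch in plain C of building one `<a href='...'>...</a>` fragment. `make_link_markup` and `append` are hypothetical names used only for illustration; in a real GTK program GLib's `g_markup_escape_text()` does the escaping part.

```c
#include <stddef.h>
#include <string.h>

/* Append src to the NUL-terminated string in dst (capacity cap).
 * If escape is non-zero, the five markup-special characters are replaced
 * by entities. Returns 0 on success, -1 if dst is too small. */
static int append(char *dst, size_t cap, const char *src, int escape)
{
    size_t len = strlen(dst);

    for (; *src != '\0'; src++) {
        char one[2];
        const char *rep = one;
        size_t rlen;

        one[0] = *src;
        one[1] = '\0';
        if (escape) {
            switch (*src) {
            case '&':  rep = "&amp;";  break;
            case '<':  rep = "&lt;";   break;
            case '>':  rep = "&gt;";   break;
            case '\'': rep = "&apos;"; break;
            case '"':  rep = "&quot;"; break;
            default:   break;
            }
        }
        rlen = strlen(rep);
        if (len + rlen + 1 > cap)
            return -1;
        memcpy(dst + len, rep, rlen + 1);   /* copies the terminating NUL too */
        len += rlen;
    }
    return 0;
}

/* Hypothetical helper: write "<a href='URL'>TEXT</a>" into out, with URL and
 * TEXT escaped. Returns 0 on success, -1 if out is too small. */
int make_link_markup(const char *url, const char *text, char *out, size_t cap)
{
    if (cap == 0)
        return -1;
    out[0] = '\0';
    if (append(out, cap, "<a href='", 0) || append(out, cap, url, 1) ||
        append(out, cap, "'>", 0)        || append(out, cap, text, 1) ||
        append(out, cap, "</a>", 0))
        return -1;
    return 0;
}
```

Several such fragments, joined with "\n" and plain escaped text in between, give the kind of single-label page described above.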
|
|
|
Post by vovchik on Jul 8, 2017 21:11:28 GMT 1
Dear Alex,
I found this bit, which might be useful:
/***************************************************************************
 *                                  _   _ ____  _
 *  Project                     ___| | | |  _ \| |
 *                             / __| | | | |_) | |
 *                            | (__| |_| |  _ <| |___
 *                             \___|\___/|_| \_\_____|
 *
 * Copyright (C) 1998 - 2015, Daniel Stenberg, <daniel@haxx.se>, et al.
 *
 * This software is licensed as described in the file COPYING, which
 * you should have received as part of this distribution. The terms
 * are also available at https://curl.haxx.se/docs/copyright.html.
 *
 * You may opt to use, copy, modify, merge, publish, distribute and/or sell
 * copies of the Software, and permit persons to whom the Software is
 * furnished to do so, under the terms of the COPYING file.
 *
 * This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY
 * KIND, either express or implied.
 *
 ***************************************************************************/
/* <DESC>
 * Shows how the write callback function can be used to download data into a
 * chunk of memory instead of storing it in a file.
 * </DESC>
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <curl/curl.h>

struct MemoryStruct {
  char *memory;
  size_t size;
};

static size_t
WriteMemoryCallback(void *contents, size_t size, size_t nmemb, void *userp)
{
  size_t realsize = size * nmemb;
  struct MemoryStruct *mem = (struct MemoryStruct *)userp;

  mem->memory = realloc(mem->memory, mem->size + realsize + 1);
  if(mem->memory == NULL) {
    /* out of memory! */
    printf("not enough memory (realloc returned NULL)\n");
    return 0;
  }

  memcpy(&(mem->memory[mem->size]), contents, realsize);
  mem->size += realsize;
  mem->memory[mem->size] = 0;

  return realsize;
}

int main(void)
{
  CURL *curl_handle;
  CURLcode res;

  struct MemoryStruct chunk;

  chunk.memory = malloc(1);  /* will be grown as needed by the realloc above */
  chunk.size = 0;            /* no data at this point */

  curl_global_init(CURL_GLOBAL_ALL);

  /* init the curl session */
  curl_handle = curl_easy_init();

  /* specify URL to get */
  curl_easy_setopt(curl_handle, CURLOPT_URL, "http://www.example.com/");

  /* send all data to this function */
  curl_easy_setopt(curl_handle, CURLOPT_WRITEFUNCTION, WriteMemoryCallback);

  /* we pass our 'chunk' struct to the callback function */
  curl_easy_setopt(curl_handle, CURLOPT_WRITEDATA, (void *)&chunk);

  /* some servers don't like requests that are made without a user-agent
     field, so we provide one */
  curl_easy_setopt(curl_handle, CURLOPT_USERAGENT, "libcurl-agent/1.0");

  /* get it! */
  res = curl_easy_perform(curl_handle);

  /* check for errors */
  if(res != CURLE_OK) {
    fprintf(stderr, "curl_easy_perform() failed: %s\n",
            curl_easy_strerror(res));
  }
  else {
    /*
     * Now, our chunk.memory points to a memory block that is chunk.size
     * bytes big and contains the remote file.
     *
     * Do something nice with it!
     */
    printf("%lu bytes retrieved\n", (unsigned long)chunk.size);
  }

  /* cleanup curl stuff */
  curl_easy_cleanup(curl_handle);

  free(chunk.memory);

  /* we're done with libcurl, so clean it up */
  curl_global_cleanup();

  return 0;
}
With kind regards, vovchik
|
|
|
Post by alexfish on Jul 8, 2017 21:43:32 GMT 1
Hi Vovchik
mem->memory = realloc(mem->memory, mem->size + realsize + 1);
Those are the bits I looked at when trying to resolve the problem with BaCon strings (there is history on that one, re the fonts for the BaCon Canvas --> STB lib), and hence:

[code]' Callback to let Curl gather data
FUNCTION Save_Data(char* buffer, size_t size, size_t nmemb, void *userp)
    mail$ = CHOP$(buffer)
    MAIL$ = MAIL$ & mail$ & "\n"
    'mail$ = LEFT$(mail$,LEN(mail$)-1)
    RETURN size*nmemb
END FUNCTION [/code]

Not sure if this is part of the global init feature, curl_global_init(CURL_GLOBAL_ALL), but it may be worth trying.

Here at least, with the updated code above, everything is working. The exceptions: I still need to try loading other sites, and to test searching for "www." inside a word, as with "http". A parser can also be run to make sure "\n" is put in where it is missing; here I do not need libtidy.

Will start on these bits tomorrow.

In the meantime, I can report whether the header or the closing "</body></html>" is missing, i.e. whether each file is complete or not.
BR Alex
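
The completeness report mentioned above could look roughly like the following sketch in C. It only checks that an opening `<html` tag is followed later by `</body>` and then `</html>`, ignoring case. `html_looks_complete` and `find_nocase` are hypothetical names, and the check is a heuristic: a page can be truncated in the middle and still pass if it happens to contain those strings, so the return code of curl_easy_perform() remains the first thing to look at.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive search for needle within the first len bytes of hay.
 * Returns a pointer to the match or NULL. */
static const char *find_nocase(const char *hay, size_t len, const char *needle)
{
    size_t nlen = strlen(needle);
    size_t i, j;

    if (nlen == 0 || nlen > len)
        return NULL;
    for (i = 0; i + nlen <= len; i++) {
        for (j = 0; j < nlen; j++) {
            if (tolower((unsigned char)hay[i + j]) !=
                tolower((unsigned char)needle[j]))
                break;
        }
        if (j == nlen)
            return hay + i;
    }
    return NULL;
}

/* Hypothetical helper: returns 1 if buf (len bytes, need not be
 * NUL-terminated) contains "<html", then "</body>", then "</html>". */
int html_looks_complete(const char *buf, size_t len)
{
    const char *open_tag, *body_end, *html_end;

    open_tag = find_nocase(buf, len, "<html");
    if (open_tag == NULL)
        return 0;
    body_end = find_nocase(open_tag, len - (size_t)(open_tag - buf), "</body>");
    if (body_end == NULL)
        return 0;
    html_end = find_nocase(body_end, len - (size_t)(body_end - buf), "</html>");
    return html_end != NULL;
}
```

The function takes an explicit length because the data handed to a libcurl write callback is not NUL-terminated; the byte count is size*nmemb.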
|
|
|
Post by alexfish on Jul 9, 2017 4:02:55 GMT 1
|
|
|
Post by alexfish on Jul 9, 2017 10:36:35 GMT 1
Hi Vovchik & all
Among the cURL examples there is a GTK demo worth looking at.
BR Alex
/*****************************************************************************
 *                                  _   _ ____  _
 *  Project                     ___| | | |  _ \| |
 *                             / __| | | | |_) | |
 *                            | (__| |_| |  _ <| |___
 *                             \___|\___/|_| \_\_____|
 *
 *  Copyright (c) 2000 David Odin (aka DindinX) for MandrakeSoft
 */
/* <DESC>
 * use the libcurl in a gtk-threaded application
 * </DESC>
 */

#include <stdio.h>
#include <gtk/gtk.h>

#include <curl/curl.h>

GtkWidget *Bar;

size_t my_write_func(void *ptr, size_t size, size_t nmemb, FILE *stream)
{
  return fwrite(ptr, size, nmemb, stream);
}

size_t my_read_func(void *ptr, size_t size, size_t nmemb, FILE *stream)
{
  return fread(ptr, size, nmemb, stream);
}

int my_progress_func(GtkWidget *bar,
                     double t, /* dltotal */
                     double d, /* dlnow */
                     double ultotal,
                     double ulnow)
{
/*  printf("%d / %d (%g %%)\n", d, t, d*100.0/t);*/
  gdk_threads_enter();
  gtk_progress_set_value(GTK_PROGRESS(bar), d*100.0/t);
  gdk_threads_leave();
  return 0;
}

void *my_thread(void *ptr)
{
  CURL *curl;
  CURLcode res;
  FILE *outfile;
  gchar *url = ptr;

  curl = curl_easy_init();
  if(curl) {
    const char *filename = "test.curl";
    outfile = fopen(filename, "wb");

    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, outfile);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, my_write_func);
    curl_easy_setopt(curl, CURLOPT_READFUNCTION, my_read_func);
    curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0L);
    curl_easy_setopt(curl, CURLOPT_PROGRESSFUNCTION, my_progress_func);
    curl_easy_setopt(curl, CURLOPT_PROGRESSDATA, Bar);

    res = curl_easy_perform(curl);

    fclose(outfile);
    /* always cleanup */
    curl_easy_cleanup(curl);
  }

  return NULL;
}

int main(int argc, char **argv)
{
  GtkWidget *Window, *Frame, *Frame2;
  GtkAdjustment *adj;

  /* Must initialize libcurl before any threads are started */
  curl_global_init(CURL_GLOBAL_ALL);

  /* Init thread */
  g_thread_init(NULL);

  gtk_init(&argc, &argv);
  Window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
  Frame = gtk_frame_new(NULL);
  gtk_frame_set_shadow_type(GTK_FRAME(Frame), GTK_SHADOW_OUT);
  gtk_container_add(GTK_CONTAINER(Window), Frame);
  Frame2 = gtk_frame_new(NULL);
  gtk_frame_set_shadow_type(GTK_FRAME(Frame2), GTK_SHADOW_IN);
  gtk_container_add(GTK_CONTAINER(Frame), Frame2);
  gtk_container_set_border_width(GTK_CONTAINER(Frame2), 5);
  adj = (GtkAdjustment*)gtk_adjustment_new(0, 0, 100, 0, 0, 0);
  Bar = gtk_progress_bar_new_with_adjustment(adj);
  gtk_container_add(GTK_CONTAINER(Frame2), Bar);
  gtk_widget_show_all(Window);

  if(!g_thread_create(&my_thread, argv[1], FALSE, NULL) != 0)
    g_warning("can't create the thread");

  gdk_threads_enter();
  gtk_main();
  gdk_threads_leave();
  return 0;
}
|
|
|
Post by vovchik on Jul 9, 2017 11:54:12 GMT 1
Dear Alex,
Thanks. Compiles with these mods:
/* gcc curlgtk.c -w -o curlgtk -lgthread-2.0 -lglib-2.0 -lcurl `pkg-config --libs --cflags gtk+-2.0` */
#include <stdio.h>
#include <gtk-2.0/gtk/gtk.h>
#include <curl/curl.h>
And it needs a URL on the command line, which it subsequently dumps into test.curl. There is a deprecated g_threads function, but it seems to work nevertheless.
With kind regards, vovchik
|
|
|
Post by alexfish on Jul 9, 2017 15:49:00 GMT 1
Hi Vovchik
Have been testing the

https://duckduckgo.com/html/?= ...

Most other means require JavaScript.

For now, sticking with the html version, since it gives reasonable results, especially when searching for

====================================
BaCon Linux

Most direct links come up as https.
====================================
Now, if the search is changed to

====================================
basic bacon linux

then links starting with http come up, pointing to parts of the site. These can be stripped down to give a direct link, or a link to part of the site.
=========================================================================================

So here I am facing the choice between https and http ... what can I say.

General thinking: if opting for https, then depending on the return value from curl it is all or nothing, as in your problem.

If the return is 0, then one can drop the guard and bypass it; in the BaCon web site case, the solution above works.

Tested the GTK demo and all is working. The important bit there is the

curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0L);
curl_easy_setopt(curl, CURLOPT_PROGRESSFUNCTION, my_progress_func);

If CURLOPT_NOPROGRESS is not set before CURLOPT_PROGRESSFUNCTION, the result is 0.
BR Alex
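
A related detail in the demo's my_progress_func: it computes d*100.0/t directly, and libcurl passes 0 for a total it does not know (for example before the response headers arrive, or when the server sends no Content-Length), so the division can produce NaN or infinity. A minimal sketch of a guarded calculation follows; `progress_percent` is a hypothetical helper name, not part of libcurl or GTK.

```c
/* Hypothetical helper: convert libcurl's (total, now) byte counts into a
 * percentage in the range 0..100, returning 0 while the total is unknown. */
double progress_percent(double total, double now)
{
    double pct;

    if (total <= 0.0)
        return 0.0;              /* total not known (yet) */
    pct = now * 100.0 / total;
    if (pct < 0.0)
        pct = 0.0;
    if (pct > 100.0)
        pct = 100.0;
    return pct;
}
```

Inside the callback the value would then be passed on as gtk_progress_set_value(GTK_PROGRESS(bar), progress_percent(t, d)).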
|
|
|
Post by alexfish on Jul 9, 2017 17:55:31 GMT 1
Hi vovchik

Was looking into the clean-up for cURL where the global init function is used, and found this example HERE:

#include <stdio.h>      /* fprintf */
#include <stdlib.h>     /* system */
#include <windows.h>    /* Sleep */
#include <curl/curl.h>

int main(void)
{
  CURL *handle;
  CURLcode result;

  int error = 0;
  int error2 = 0;

  curl_global_init(CURL_GLOBAL_ALL);
  handle = curl_easy_init();

  if(handle) {
    curl_easy_setopt(handle, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6");
    curl_easy_setopt(handle, CURLOPT_URL, "http://www.google.com");
    result = curl_easy_perform(handle);

    if(result != CURLE_OK) {
      fprintf(stderr, "curl_easy_perform() failed: %s\n", curl_easy_strerror(result));

      error++;
    }

    Sleep(5000); // make a pause if you working on console application

    /* reset all options on the handle; it stays usable for a new transfer */
    curl_easy_reset(handle);

    curl_easy_setopt(handle, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6"); // have to write it again
    curl_easy_setopt(handle, CURLOPT_URL, "http://www.bbc.com");
    result = curl_easy_perform(handle);

    if(result != CURLE_OK) {
      fprintf(stderr, "curl_easy_perform() failed: %s\n", curl_easy_strerror(result));

      error2++;
    }

    if(error == 1 || error2 == 1) {
      /* release the handle and libcurl before bailing out */
      curl_easy_cleanup(handle);
      curl_global_cleanup();
      return 1;
    }
  }
  else {
    fprintf(stderr, "Curl init failed!\n");

    return 1;
  }

  curl_easy_cleanup(handle);
  curl_global_cleanup();

  system("PAUSE");

  return 0;
}
|
|
|
Post by alexfish on Jul 30, 2017 20:02:45 GMT 1
Hi All

Latest search engine + http load & display, TEXT only.

Depends -- libcurl, libgtk-2 and html2text

Usage --
Enter the text to search for in the entry, then press return.
From the results, click on a url (blue text).
To view a page entered in the entry box, press return on the entry box.
Clicking the last Page sequence displays the last web view. This also works on a fresh boot if a web view has been loaded, so it is an easy way to view the last page.
Pages X X X also work on a fresh boot if a search has been downloaded.
All are saved in a folder in the home DIR, 'my_bacon_web_results'.

Hopefully the next edition will have its own parser for the html.

ADDED: just tried to connect to this forum using libcurl and get an error, yet using exec curl it works. Any ideas? This is the reply when using libcurl:

[ProBoards]
****** Attention required! ******
The system has interpreted your activity or connection to this page as suspicious. If you believe you've received this message wrongly, this may simply be a false positive caused by an error on our end, and we apologize for the inconvenience.
Simply fill out the challenge below to regain access to the site you were attempting to browse.
[Submit]
For help, or to report any issues you're currently having, please visit the ProBoards_Support_Forum.
ProBoards ProBoards Policies Need Help? Create_a_Free_Forum Terms_of_Service Support_Forum Visit_our_Homepage Community_Guidelines Help_Guide Advertise_With_Us Privacy_Policy Server_Status
© 2013 ProBoards, Inc. ProBoards® is a registered trademark of ProBoards, Inc.

BR Alex

Attachments: curlduck.bac.bz2 (7.21 KB)
|
|
|
Post by vovchik on Jul 30, 2017 22:21:24 GMT 1
Dear Alex,
Thanks and nice work. It is working but I am wondering what this is:
(curlduck:25336): Gtk-WARNING **: Unable to show ' https://en.wikipedia.org/wiki/Semion_Mogilevich': Operation not supported
Does it have something to do with javascript?
With kind regards, vovchik
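
One thing visible in the warning itself is the space between the opening quote and "https". Whether that space is really part of the string handed to GTK, and whether it is what triggers "Operation not supported", is not established here; the same message may also appear when no handler for the URI is available on the system. If the space does come from the parsed search results, trimming the URL before it is used is cheap. A minimal sketch, with `trim_url` as a hypothetical helper name:

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: remove leading and trailing whitespace from s,
 * in place, and return s. */
char *trim_url(char *s)
{
    size_t start = 0;
    size_t end = strlen(s);

    while (start < end && isspace((unsigned char)s[start]))
        start++;
    while (end > start && isspace((unsigned char)s[end - 1]))
        end--;

    /* memmove is used because source and destination overlap */
    memmove(s, s + start, end - start);
    s[end - start] = '\0';
    return s;
}
```

In the BaCon program the equivalent would be to run the extracted link through CHOP$ before it is placed in the label markup.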
|
|