I have no computer science background (though I'm versed in HTML, JavaScript, CSS, XML, PHP, SQL, etc.).
I want to program a simple application that crawls a web page, looks for form inputs (text fields, radio buttons, submit buttons, etc.) and performs certain tasks on them, such as typing in text.
What programming language should I learn, and what compiler should I use?
I'd say Python is a great language for doing what you suggested.
Perl is good too. They fall under the category of scripting languages - these are languages made primarily for scripting tasks like the one you described.
But since you know PHP, you can actually leverage your experience in that instead of picking a new language up. http://www.thecredence.com/creating-bots-spiders-and-crawlers-with-php/
Of course, if you want to write an app that does your task with the fastest speed, nothing can beat C++ (:
Thanks a lot for the information.
Mugging time!
C++ is good for speed.
PHP, on the other hand, is good for its libraries and API; basically, ease of use.
For crawling... if you are leaving it to run unattended, you have to look out for keywords in the HTML source. However, if the webpage is AJAX-based or the programmer is a f*cker like me who uses document.write in a loopy way, you are going to have a harder time crawling the page, because the markup you are looking for is generated by script and is not in the source you download.
I'm not sure if I should call this "crawling", but I need my program to be able to detect certain elements on an already-loaded web page and perform certain actions on those elements.
E.g. search for a web form text field and type in certain text within the field.
E.g. search for a web form submit button and click it.
E.g. search for a hyperlink and open it in a new window.
Should C++ still be the way to go? It doesn't sound like scripting languages can do these tasks, though I might be wrong.
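For what it's worth, here is a rough sketch of the "detect elements" part in Python, one of the scripting languages suggested earlier in the thread. It uses the standard library's html.parser to pick out text fields, submit buttons and hyperlinks from a page's HTML source. The FormScanner class and the SAMPLE_PAGE markup are made up for illustration. It also works on the HTML source rather than on a page already open in a browser window, so "typing" here only means preparing the data the form would send.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormScanner(HTMLParser):
    """Collects text fields, submit buttons, and links from HTML markup."""

    def __init__(self):
        super().__init__()
        self.text_fields = []
        self.submit_buttons = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lowercased
        if tag == "input":
            # An <input> without a type attribute defaults to a text field.
            input_type = (attrs.get("type") or "text").lower()
            if input_type == "text":
                self.text_fields.append(attrs.get("name"))
            elif input_type == "submit":
                self.submit_buttons.append(attrs.get("value"))
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])


# Hypothetical page source; a real program would download or load it first.
SAMPLE_PAGE = """
<form action="/search" method="get">
  <input type="text" name="q">
  <input type="submit" value="Go">
</form>
<a href="/about.html">About</a>
"""

scanner = FormScanner()
scanner.feed(SAMPLE_PAGE)

# "Typing" into a field from a script means building the data the form would send.
query_string = urlencode({scanner.text_fields[0]: "hello world"})
print(scanner.text_fields, scanner.submit_buttons, scanner.links, query_string)
```

With the sample markup, the scanner finds one text field named q, one submit button labelled Go and one link, and the encoded form data comes out as q=hello+world. Actually sending that data, or opening a link in a new window, is a separate step this sketch does not cover.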
This one is probably easy enough for PHP to do.
No need to trouble C++.
So how does one do those things with PHP?
It has to do with string manipulation. In this case, you need to create a new file handle, open the HTML file and read the HTML code from the file.
For example, if you are looking for radio buttons, look for the phrase ' type="radio" ' inside an <input> tag in the HTML code. There are a thousand and one ways people might code the HTML for different forms, but the general idea is the same.
I guess the best function that you can use in PHP is strstr(); stripos() does a similar search without caring about upper or lower case. Go look them up in the documentation to see how to use them. That should be sufficient for simple cases.
Of course, the ultimate spanner that can be thrown in the works is when the person chooses to document.write() from an XMLHttpRequest call, but that's beyond what I can help you with.
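The string-search idea above could look roughly like this. The sketch is in Python rather than PHP, with str.find() standing in for strstr(); the find_radio_inputs function and the SAMPLE_FORM markup are invented for illustration, so treat it as one possible approach and not the definitive one.

```python
def find_radio_inputs(html):
    """Return the raw text of each <input> tag that contains type="radio"."""
    matches = []
    # Search case-insensitively but slice from the original text.
    # Assumes lowercasing keeps the string length unchanged (true for ASCII markup).
    lowered = html.lower()
    position = 0
    while True:
        start = lowered.find("<input", position)
        if start == -1:
            break  # no more <input> tags
        end = lowered.find(">", start)
        if end == -1:
            break  # unterminated tag, stop scanning
        tag_text = html[start:end + 1]
        if 'type="radio"' in tag_text.lower():
            matches.append(tag_text)
        position = end + 1
    return matches


# Hypothetical markup for illustration only.
SAMPLE_FORM = """
<input type="radio" name="size" value="small">
<input type="text" name="comment">
<INPUT TYPE="radio" NAME="size" VALUE="large">
<input type='radio' name="size" value="medium">
"""

radio_tags = find_radio_inputs(SAMPLE_FORM)
print(len(radio_tags))
```

On this sample the search finds the first and third tags but misses the fourth, which uses single quotes around radio. That is the "thousand and one ways" problem in miniature: a plain phrase search only catches the spellings you thought of.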