This will be a VERY LONG explanation, so if you can, grab a drink of some sort. However, by the end of it, you’ll have a working scraper.
What you are trying to do is basically data scraping. You asked whether it is possible to generate a URL - I’m going to do better than that and show you how to do the data scraping in PHP, so you don’t even have to use your web browser.
To do this, you’ll need:
- a web server running PHP > 5.1
- the PHP cURL extension with full support OR stream wrappers (the first is recommended, since it lets you use proxies should you need to)
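For reference, the cURL route could look roughly like the sketch below. The helper name fetchPage(), its $proxy parameter and the timeout values are assumptions made for illustration, not something the rest of this post depends on; only standard cURL calls are used.

[php]// Hypothetical helper: fetch a URL with cURL, optionally through a proxy.
// Returns the response body as a string, or null if the request failed.
function fetchPage($url, $proxy = null) {
    if (!function_exists('curl_init')) {
        return null; // cURL extension is not loaded
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);   // assumed timeouts, tune as needed
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    if ($proxy !== null) {
        curl_setopt($ch, CURLOPT_PROXY, $proxy);    // "host:port" of your proxy
    }
    $body = curl_exec($ch);
    curl_close($ch);
    return ($body === false) ? null : $body;
}[/php]

Returning null on failure keeps "the request failed" separate from "the page came back empty", which matters once you start making decisions based on the page content.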
Please note that I make a living writing scrapers for people. As a result, this is basically second-nature, which means that I may brush over things that might confuse you. If anything is unclear, feel free to say so and I’ll elaborate.
Please note as well that this is not an exact science!
Part 1: Generating the numbers
We’ll write a quick generator. Nothing fancy - two letters, two digits, three letters.
[php]function generatePlate() {
    // "a" = random uppercase letter, "d" = random digit
    $format = "aaddaaa";
    $fl = strlen($format);
    $plate = "";
    for ($i = 0; $i < $fl; $i++) {
        if ($format[$i] == "a") {
            $plate .= chr(65 + rand(0, 25)); // 65 is ASCII "A"
        }
        else {
            $plate .= chr(48 + rand(0, 9)); // 48 is ASCII "0"
        }
    }
    return $plate;
}[/php]
This generates number plates! Woo! The logic is pretty simple - the chr() function returns the character for a given ASCII code. 48 is the first digit (0), 65 the first uppercase letter (A).
With this written, we can go on to…
Part 2: Scraping!
Scraping is actually trivial in this case. We’re simply going to look for the keyword “non-existant” in the page the site returns: if it is there, the plate has no insurance on record. Let’s have some fun! I’m going to use stream wrappers.
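[php]function check($numberplate) {
    // Returns true if the plate is insured, false if it is not, null if the request failed.
    // zi, luna and an are the day, month and year to check.
    $wr = file_get_contents("http://cedam.csa-isc.ro/search.php?c=nr_inmatriculare&v=".urlencode($numberplate)."&zi=07&luna=05&an=2013");
    if (!empty($wr)) {
        // stripos() returns 0 for a match at the very start, so compare against false explicitly
        if (stripos($wr, "non-existant") !== false) return false;
        else return true;
    }
    return null;
}[/php]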
Pretty simple, huh? That’s it. If you’d like to test it, here are a few test cases:
- AR18AUG is covered by insurance
- So is MM08TCI
The generator has a drawback - it does not cover the old plate format (ADDAAA). It’s easy to adapt the code, though.
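One possible adaptation is to pass the format in as a parameter instead of hard-coding it. The sketch below assumes the same convention as the generator above ("a" for a letter, anything else for a digit); the name generatePlateFromFormat() is made up for this example.

[php]// Hypothetical variant of generatePlate(): the format is a parameter.
// "a" stands for a random uppercase letter, any other character for a random digit.
function generatePlateFromFormat($format) {
    $plate = "";
    $fl = strlen($format);
    for ($i = 0; $i < $fl; $i++) {
        if ($format[$i] == "a") {
            $plate .= chr(65 + rand(0, 25)); // A-Z
        }
        else {
            $plate .= chr(48 + rand(0, 9)); // 0-9
        }
    }
    return $plate;
}

$newStyle = generatePlateFromFormat("aaddaaa"); // two letters, two digits, three letters
$oldStyle = generatePlateFromFormat("addaaa");  // old format: one letter, two digits, three letters[/php]

Both results can be passed to check() unchanged, since it only cares about the plate string.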