PHP/MariaDB...not sure where to even start...

So I have a massive karaoke database with over 30,000 records. Now here’s the tricky part…

A lot of the records (about 5,000 or so) are formatted oddly, for example

Last Name, First Name (ie. Petty, Tom) to Tom Petty

  • or - Rolling Stones, The to The Rolling Stones

Now I need to swap these values, without going through every single record and changing it manually, or going through and doing a query for every single artist (there are a lot of records where an artist has maybe 3-4 songs).

Is there anyway to develop a simple SQL query for use with either PHPMyAdmin or a PHP PDO query that can take care of this in one swing? I also need to do the same thing for the song titles. I apologize if this is in the wrong place, or seems really really difficult. I just don’t have the time to go through each and every record.

You would need to determine the common denominator of what needs to be the catalyst to make the change first.

This would be an etl process, but is doable.

I figured it would end up being an ETL solution, but again. I have no idea where to even start. This is the most complex query i’ve had to draw up.

Odds are against being able to do this in one sweep, unless the same issue occurred with all of the troubled records.

Clarify the example for me, what are the records like now, compared to how they should be stored.

Also, is this for this same database, http://www.phphelp.com/forum/mysql-database/incremental-search/msg93481/#msg93481?

Here’s an example of the table as it stands.
Imgur

How common is a name entered in correctly like ‘The Breeders, The’?

This basic process will fix things a great deal, I didn’t do it as a stored procedure, as I didn’t know if this would be an ongoing issue:

[php]$stmt = $pdo->prepare(“SELECT sID, sArtist FROM songs WHERE sArtist LIKE ‘%,%’”);
$stmt->execute();
$result = $stmt->fetchAll(PDO::FETCH_ASSOC);

foreach ($result as $k) {
$parts = explode(’,’, $k[‘sArtist’]);
$nArtist = “{$parts[1]} {$parts[0]}”;
echo “

Updating: $nArtist

”;
$stmt = $pdo->prepare(“UPDATE songs SET sArtist = ? WHERE sID = ?”);
$stmt->execute([
$nArtist,
$k[‘sID’]
]);
}[/php]

First off I’d suggest to change the database structure in the same go. You are repeating a lot of data and would have a much easier time fixing this if artist was a foreign key to a separate artists table.

[code]artist
id, name, etc
1, “The Breeders, The”
2, “Commitments, The”
3, “Confederate Railroad, The”

record
id, artist_id, title
2323, 1, Cannonball
4251, 2, Chain Of Fools
4252, 2, Gimme Some Lovin’
4253, 2, I Can’t Stand The Rain
4254, 2, In The Midnight Hour[/code]

Having it structured in a normalized way means you can go edit the artist name in one place and fix all other entries where that artist is used.

[hr]

30 000 records isn’t much. I’d just create a conversion script and run it.

[php]<?php

$host = ‘127.0.0.1’;
$db = ‘test’;
$user = ‘root’;
$pass = ‘’;
$charset = ‘utf8’;

$increment = 200; // how many records to fetch on each page load, adjust if the server times out

$dsn = “mysql:host=$host;dbname=$db;charset=$charset”;
$opt = [
PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,
PDO::ATTR_EMULATE_PREPARES => false,
];
$pdo = new PDO($dsn, $user, $pass, $opt);

$start = isset($_GET[‘start’]) ? $_GET[‘start’] : 0;

$stmt = $pdo->query(“SELECT id, artist FROM table LIMIT $start,$increment”);
$records = $stmt->fetchAll();

if (count($records) == 0)
{
die(‘done’);
}

$pdo->prepare(“UPDATE table SET artist = ? WHERE id = ?”);

foreach ($records as $record)
{
if ($record[‘artist’] == ‘blah’) // needs logic to find what you would consider a wrong artist name
{
$artist= ‘foo’; // needs logic to fix the name in $record[‘artist’]
$pdo->execute([$artist, $record[‘id’]]);
}
}

$start = $start+$increment;
header(“Location: $_SERVER[‘PHP_SELF’] ?start=$start”);
die();[/php]

This file will fetch the first 200 records, look through them for the error(s) you’ve added, and if needed update the entries. Then reload the page to do the operation on the next 200 records.

For 30.000 records this shouldn’t take long.

+1 on the Normalization, but I believe we have had that discussion before.

Good standards can never be mentioned too many times :slight_smile:

Thank you. I will give this a try tomorrow and see what I can get from it (since it’s quite late here on the east coast [USA]) and I was not near my desk. I will report back any issues I may find, or if this resolves my issue.

I believe I originally had asked someone here once about this with another project and it just seems to keep coming up. The data was given to me in an excel format for this database and was formatted as is. In the future any additions to it will be formatted correctly.

How much work do you think it would take to update the code base to allow for a normalized database? Another script could be run to parse the current table and split into a new ‘artist’ table and drop their respective id’s into the songs table.

Sponsor our Newsletter | Privacy Policy | Terms of Service