2009 October
Monday, October 19th, 2009
by Chris
I had a request today from a customer who wanted to make a list of products he was waiting for, and to be notified when Mighty Ape had them on the site ready to order. I decided that it was going to be problematic to implement this as a Mighty Ape feature, but I did help him sort out something that works just as well, using free tools available already.
Say you want to be notified when Mighty Ape makes the Transformers 2: Revenge Of the Fallen Blu-Ray available. Here’s how you could go about it.
First, construct a Google query that narrows down what you’re after.
Since you’re interested in a product, make sure you add the operator “site:mightyape.co.nz/product” to the query. That restricts Google to searching within that directory on our site.
We’re only interested in the Blu-Ray version, so add that too, as “blu-ray”.
Now add some keywords that describe the product. Try not to be too specific, and avoid filler words, numbers and punctuation. Something like “transformers revenge fallen”.
On its own, that might turn up some strange results, such as matches in other products’ descriptions. Luckily, Mighty Ape stores the product name and format in the page title, so we can direct Google to match only the title by using the “allintitle:” operator.
So put that all together and we have:
allintitle:transformers revenge fallen "blu-ray" site:mightyape.co.nz/product
Nice. Now all we need to do is put that into a new Google Alert, and Google will let you know, by e-mail or RSS feed, when new documents matching your query turn up.
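If you end up building a few of these, the query can be assembled from its parts. Here’s a minimal PHP sketch, assuming a hypothetical helper called buildAlertQuery (my own name, not part of any Google or Mighty Ape API). It also prints a plain Google search URL so you can preview the results before pasting the query into a new alert.

```php
<?php
// Hypothetical helper: joins the pieces in the same order as the query above.
function buildAlertQuery(array $titleWords, string $format, string $site): string
{
    return sprintf('allintitle:%s "%s" site:%s', implode(' ', $titleWords), $format, $site);
}

$query = buildAlertQuery(
    ['transformers', 'revenge', 'fallen'],
    'blu-ray',
    'mightyape.co.nz/product'
);

// urlencode() escapes the colons, quotes and slash so the query survives in a URL.
$url = 'https://www.google.com/search?q=' . urlencode($query);

echo $query, PHP_EOL;
echo $url, PHP_EOL;
```

Swap in different title words, format or site path to adapt it to other products or other sites.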
Enjoy! This can easily be adapted to other sites or other queries. Anybody got any other cool examples of useful Google Alerts?
Tuesday, October 13th, 2009
by Chris
Ars Technica has a fantastic article up titled “100 years of Big Content fearing technology—in its own words”.
The article touches on most of the major technological advances of the last century, and digs up what “Big Content” had to say about each one at the time. Big Content being the music, film and print industries.
Such business-busters as the Xerox machine:
“the day may not be far off when no one need purchase books”
The VCR:
We are going to bleed and bleed and hemorrhage, unless this Congress at least protects one industry that is able to retrieve a surplus balance of trade and whose total future depends on its protection from the savagery and the ravages of this machine.
And of course, MP3:
We’ve come full circle here, as this is the inverse of Sousa; a new technology won’t eliminate the amateurs, it will eliminate all the professionals and leave nothing but amateurs.
Of course, this seems all too familiar. The roadblocks and excuses being thrown up today are the same ones that were spun up all those years ago, and those predictions failed to come to fruition. The same things are being threatened – the death of industries, the loss of jobs and the destruction of dreams.
However, what can be said of all of these advances is that they caused change. They caused a reshuffle of the industry in question, but eventually everything settled down and the industry got back to what it was doing: making money. In fact, if I remember correctly, most content industries are in boom, some making more money than they ever have (with the exception of newspapers, but I feel they’re the authors of their own demise).
So “Big Content” is quite comfortable where it is, thank you very much, and is apparently quite happy to fight tooth and nail against anything that may cause change. The problem is, these industries have a lot of teeth and too many nails. We have the huge lobby groups, the MPAA and RIAA, and the very prominent spokespeople with their voices in the ears of our politicians.
This is why so many people keep an eye on the laws being passed in our names, on behalf of these industries. We need to watch what’s going on in the fields of Copyright and Intellectual Property (don’t even get me started about software patents) before we regulate ourselves out of evolution.
Tuesday, October 13th, 2009
by Chris
Was working on a problem today that required a lot of bulk inserts to a MySQL table. I was getting about 200 inserts/second on my development system, which is OK considering there was some minor processing going on.
Since some batches can contain well over a million rows, I started working on how to optimise these queries so we can get them in there faster.
First, watch the indexes on the table. The more indexes you have on the table, the more work the DB has to do on every INSERT to maintain them.
Second, if possible, ditch your ORM. Instead of building and hydrating an object for each row, use direct/prepared queries. Most ORMs worth their salt can handle this, e.g. Propel can give you direct access to the underlying PDO connection so you can use your own prepared statements.
/** @var $con PDO */
$con = Propel::getConnection();
Obviously if you have business logic tied up in your objects, it’s best to use them instead of duplicating code.
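To make the prepared-statement idea concrete, here’s a rough sketch using plain PDO. The assumptions are mine: an in-memory SQLite database stands in for MySQL so the snippet runs anywhere the pdo_sqlite extension is loaded, and the product table and its columns are made up for illustration. With Propel you’d use the connection from Propel::getConnection() instead.

```php
<?php
// Assumption: in-memory SQLite stands in for the real MySQL connection.
$con = new PDO('sqlite::memory:');
$con->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table, for illustration only.
$con->exec('CREATE TABLE product (sku TEXT, name TEXT)');

$rowsToInsert = [
    ['sku' => 'A1', 'name' => 'First'],
    ['sku' => 'B2', 'name' => 'Second'],
];

// Prepare the statement once; only the bound values change per row.
$stmt = $con->prepare('INSERT INTO product (sku, name) VALUES (:sku, :name)');
foreach ($rowsToInsert as $row) {
    $stmt->execute([':sku' => $row['sku'], ':name' => $row['name']]);
}

// Check how many rows actually landed in the table.
$count = (int) $con->query('SELECT COUNT(*) FROM product')->fetchColumn();
echo $count, PHP_EOL;
```

No objects are built or hydrated here, which is where the saving over the ORM comes from.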
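Third, wrap each batch of inserts in a transaction (or a table lock). Here’s some sample code:
$i = 0;
$con = Propel::getConnection();
$con->beginTransaction();
foreach ($rowsToInsert as $row) {
    TablePeer::insertUsingPreparedStatement($row);
    $i++;
    // Commit every 100 rows, then start the next batch.
    if ($i % 100 == 0) {
        $con->commit();
        $con->beginTransaction();
    }
}
// Commit whatever is left in the final, partial batch.
$con->commit();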
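If you’re using MyISAM, which doesn’t support transactions, you can lock the table around each batch with the LOCK TABLES statement instead:
$con->exec('LOCK TABLES table_name WRITE');
// ... run the batch of inserts here ...
$con->exec('UNLOCK TABLES');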
Depending on how important your data is, you can raise or lower the (100) value in the $i % conditional. The reason this speeds up inserts so much is that MySQL won’t flush its write cache to disk until the transaction is finished, as opposed to after every INSERT statement. However, having unflushed data in your cache is dangerous because it may disappear if something happens to the DB server, or get rolled back if your script carks it. Also, a write lock stops all other access to the table, and a long-running transaction can hold up other queries too, so if the table is read frequently those queries will be waiting on you. It’s good to commit or unlock frequently.
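If you want to play with the batch size, it helps to pull the loop into a function. This is only a sketch under my own assumptions: insertInBatches is a hypothetical helper, the product table is made up, and in-memory SQLite again stands in for MySQL so it runs without a server. It also rolls back the current batch if an insert fails.

```php
<?php
// Hypothetical helper: inserts $rows, committing after every $batchSize rows.
function insertInBatches(PDO $con, array $rows, int $batchSize = 100): int
{
    $stmt = $con->prepare('INSERT INTO product (sku, name) VALUES (?, ?)');
    $inserted = 0;

    $con->beginTransaction();
    try {
        foreach ($rows as $row) {
            $stmt->execute([$row['sku'], $row['name']]);
            $inserted++;
            // A full batch has been written: commit it and open the next one.
            if ($inserted % $batchSize === 0) {
                $con->commit();
                $con->beginTransaction();
            }
        }
        // Commit the final, possibly partial, batch.
        $con->commit();
    } catch (\Throwable $e) {
        // Only the uncommitted batch is lost; earlier batches stay committed.
        if ($con->inTransaction()) {
            $con->rollBack();
        }
        throw $e;
    }

    return $inserted;
}

// Assumption: in-memory SQLite and a made-up table, for illustration only.
$con = new PDO('sqlite::memory:');
$con->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$con->exec('CREATE TABLE product (sku TEXT, name TEXT)');

$rows = [];
for ($n = 1; $n <= 250; $n++) {
    $rows[] = ['sku' => "SKU$n", 'name' => "Product $n"];
}

$inserted = insertInBatches($con, $rows, 100);
$stored = (int) $con->query('SELECT COUNT(*) FROM product')->fetchColumn();
echo "$inserted inserted, $stored stored", PHP_EOL;
```

A bigger batch size means fewer flushes but more rows at risk if something goes wrong mid-batch, which is the trade-off described above.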
Using these three methods I almost tripled the performance of my script, which now inserts between 550 and 600 rows per second. Win!
There are some more tips in the MySQL manual, and some of the comments there are quite helpful too.
Tuesday, October 13th, 2009
by Chris
Uh, yeah, back from my blog-free stupor.