Learning to deal with the quirks of SQL's NULL value can be tedious and tricky. But it is doable, once you know how to approach it and avoid the pitfalls. As explained here.
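To illustrate the most classic of those pitfalls: NULL never compares equal to anything, not even to itself, so a `=` test silently matches nothing. A quick sketch with Python's sqlite3 (the table and column names are just made up for the example):

```python
import sqlite3

# In-memory database with one nullable column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, nickname TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [("Alice", "Al"), ("Bob", None)])

# 'nickname = NULL' evaluates to NULL (not TRUE), so no row matches.
rows = conn.execute("SELECT name FROM t WHERE nickname = NULL").fetchall()
print(rows)  # []

# IS NULL is the correct test.
rows = conn.execute("SELECT name FROM t WHERE nickname IS NULL").fetchall()
print(rows)  # [('Bob',)]
```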
We all know that guy. You know, that guy who sees performance improvements everywhere, all the time?
That programmer who squeezes everything into bitmaps because "it's so much faster"? And when I say everything, I mean everything! Even if your class only has one flag!
That guy who caches everything in the application because "that's the optimal way to do it"? Even if the cached data is accessed only once?
That guy who fits a date into an integer in the database because "it's so much more compact"? There go all your SQL date functions!
That developer who always ends up re-implementing another sorting algorithm because he read a "great paper" on the subject and "it's proven that it's 0.2% faster" than the default sort available? And after an insane debugging session, you finally realize he has overridden SortedCollection>>#sort:? And that his code just doesn't work properly!
You know, that guy with a C/C++ background who spends countless hours optimizing everything, not realizing he has to "make it work" first! You know, that guy who still doesn't get that oftentimes you only need to optimize very small parts of an application to make a real difference?
You know that guy with strange concepts such as "defensive programming" who tells you nobody ever caught a bug in his code in production? You know, that guy who came up with the "clever" catch-everything method #ifTrue:ifFalse:ifNil:onError:otherwise:?
You know that guy who never works for more than 4 months at the same place because "they didn't get it, they're a bunch of morons"?
You know, that guy who prefers to implement complex database queries with a #do: loop and a gazillion SELECT statements because "SELECT statements with a 1-row lookup are very optimized by the database server" instead of using JOINs? And then he blames the slow response time of his "highly optimized" data retrieval code on the incompetence of the DBAs maintaining the database?
You know, that guy who once told me "inheritance in Smalltalk is very bad because the deeper the class in the hierarchy, the slower the method lookup is going to be", so that's why he always preferred "flattened hierarchies" (meaning promoting ALL instance variables into one root class) with 2-3 levels deep in the worst case? "Besides, my code is easier to understand and debug since everything is in one place."
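For the record, here is what that anti-pattern (often called "N+1 queries") looks like next to a single JOIN. A sketch with sqlite3 and made-up `orders`/`customers` tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 1, 15.0);
""")

# That guy's way: one extra SELECT per row (N+1 round-trips).
slow = []
for order_id, customer_id, total in conn.execute(
        "SELECT id, customer_id, total FROM orders ORDER BY id"):
    (name,) = conn.execute("SELECT name FROM customers WHERE id = ?",
                           (customer_id,)).fetchone()
    slow.append((order_id, name, total))

# One JOIN: a single statement the server can plan and optimize as a whole.
fast = conn.execute("""
    SELECT o.id, c.name, o.total
    FROM orders o JOIN customers c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()

print(fast)  # same rows as the loop, in one round-trip
```

Each 1-row lookup may well be fast in isolation; it is the per-statement round-trip and planning overhead, multiplied by N, that kills the loop version.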
Well, I was going through some of the sh*ttiest code I've seen in a long time last night and I remembered that guy and his ideas about "highly optimized flattened hierarchies" and thought I'd measure his theory! Here's the script.
Basically, it creates a hierarchy of classes (10000 subclasses) and times message sends from the top class and the bottom class to measure the difference.
Well, that guy was right… There’s a 1 millisecond difference over 9 million message sends from the root class as compared to the 10000th class at the bottom of the hierarchy.
I wanted to tell that guy he was right but his email address at work is probably no longer valid…
CAVEAT: Make sure you do not have a category with the same name as the one in the script because the script removes the category at the end (and all classes in it!). Also, be warned it takes a very very very long time to execute!
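The shape of that experiment can be sketched in Python too (the original script is Smalltalk; this is just an analogous setup, with a much smaller depth and no claim that the numbers match the Smalltalk ones):

```python
import time

# Build a chain of DEPTH subclasses; the method lives on the root only,
# so calling it from the deepest class exercises the longest lookup path.
DEPTH = 200  # kept small here; the original Smalltalk script used 10000 classes

class Root:
    def answer(self):
        return 42

klass = Root
for _ in range(DEPTH):
    klass = type("Sub", (klass,), {})

def bench(obj, n=1_000_000):
    start = time.perf_counter()
    for _ in range(n):
        obj.answer()
    return time.perf_counter() - start

top, bottom = Root(), klass()
print(f"root:    {bench(top):.3f} s")
print(f"deepest: {bench(bottom):.3f} s")
```

As in the Smalltalk measurement, the difference is typically negligible: like most Smalltalk VMs, CPython caches method lookups, so hierarchy depth barely shows up.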
P.S. All the stories above are real. No f*cking kidding! Those guys really existed: I've worked with them!
An excellent summary that surveys sequences (and the quirks of each) across the most popular databases.
For those interested, I have released the very first version of my database of chess games, the CGR database! All details here!
Some of you may already know *lots* of MongoDB servers are compromised as we speak. That’s sad. But in a way, it’s not. It’s just the very predictable consequence of sheer incompetence. Here’s a reminder on the very basics of database server security.
As you can see, Freewill now supports 17 different selection policies. At this point, all of them are coded but only half of them have been tested.
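Freewill's actual policy classes aren't shown here, but to give an idea of what one of the classic selection policies does, here is a sketch of fitness-proportionate (roulette-wheel) selection, with made-up chromosomes:

```python
import random

def roulette_select(population, fitness, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitness)
    pick = rng.uniform(0, total)
    cumulative = 0.0
    for individual, f in zip(population, fitness):
        cumulative += f
        if pick <= cumulative:
            return individual
    return population[-1]  # guard against floating-point drift

population = ["aaaa", "aabb", "abbb", "bbbb"]
fitness = [1.0, 2.0, 3.0, 10.0]  # "bbbb" should be picked most often
counts = {p: 0 for p in population}
for _ in range(10_000):
    counts[roulette_select(population, fitness)] += 1
print(counts)
```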
The 11 available termination policies are coded, half of them tested.
So far, only 2 mutation policies are available. Both of them are coded and tested. I will probably need a few extras for TSP type of problems as well as numerically parametrized problems (e.g. De Jong functions with a domain for each variable). I’ll probably add 3-4 other ones specific to the problem that started all this adventure!
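For bit-string chromosomes, the textbook mutation policy is per-gene bit flipping; a minimal sketch (not Freewill's code, just the idea):

```python
import random

def bit_flip_mutation(chromosome, rate, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]

rng = random.Random(42)
child = bit_flip_mutation([0, 1, 1, 0, 1, 0, 0, 1], rate=0.25, rng=rng)
print(child)
```

For numerically parametrized problems, the per-gene structure stays the same; only the "flip" step changes (e.g. adding Gaussian noise clamped to each variable's domain).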
Only one immigration policy (no immigration!) is available and it will stay that way for a long time. I'll wait until I am hyper confident that this framework is rock solid before introducing parallelism and exchange of individuals between "islands" (i.e. simulations). This one is faaaaaaar away!
Six crossover policies are available as of now. This area will require some (minor, I think, at first glance) changes for the TSP type of problems: I'm not quite decided on the approach I will take to solve this. Since crossover is often very problem/chromosome specific, I'll probably delay those changes until the end, once I have all examples coded and ready to be tested, to have a better idea of what is needed. But I will definitely add a few (3-4) crossover policies tailored for the Ruzzle problem.
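The simplest of the classic crossover policies, single-point crossover, can be sketched like this (again an illustration, not Freewill's code):

```python
import random

def single_point_crossover(parent_a, parent_b, rng=random):
    """Cut both parents at one random point and swap the tails."""
    assert len(parent_a) == len(parent_b)
    point = rng.randrange(1, len(parent_a))
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

a = [0, 0, 0, 0, 0, 0]
b = [1, 1, 1, 1, 1, 1]
child1, child2 = single_point_crossover(a, b, rng=random.Random(7))
print(child1, child2)
```

This also shows exactly why TSP needs special treatment: on a permutation chromosome, swapping tails duplicates and drops cities, which is why order-preserving operators (OX, PMX, etc.) exist for that family of problems.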
I have solved the discrepancy (see here and here) between my results and the TSPLIB ones regarding the tour length of the Burma14 problem. I will probably add a much bigger TSP problem to see how the framework can handle an extremely huge search space! Oh! And I need to clean up all the crap I added/modified while looking for the "distance difference" problem: 2 classes were butchered in the process!
I need to add a few "crash test dummy" classes to test all those different selection policies (and crossover) in a simpler and more efficient manner! Or I should kick myself in the %*&#$!@ and code the "bits" example classes…
I will soon work on a customizable display of statistics. All that’s needed is already there, it’s just a matter of gluing everything together!
Once I'm done with the 8 queens problem, I'll attack the numerically parametrized problems. Will probably have 2-3 examples (from De Jong functions) as well as the INSANE Griewank function.
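For reference, the Griewank function itself is short to state; what makes it insane is its huge number of regularly spaced local minima, not the formula:

```python
import math

def griewank(xs):
    """Griewank test function: global minimum of 0 at the origin."""
    square_sum = sum(x * x for x in xs) / 4000.0
    cos_product = math.prod(math.cos(x / math.sqrt(i))
                            for i, x in enumerate(xs, start=1))
    return 1.0 + square_sum - cos_product

print(griewank([0.0, 0.0, 0.0]))  # 0.0 at the global optimum
```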
The classes used for randomly choosing the next parent chromosomes, as well as scaling/ranking, can be optimized. But since they have worked great since day 1, I'll keep that for the very end. I know they can be a lot faster than they are right now.
I also plan on having a very basic export mechanism so I can dump all those ruzzle chromosomes in a MySQL database to be able to do some reporting and study the various policies and their effects.
I started adding comments to the classes, mostly to keep references, maintain a todo list per class and add some notes for myself to quickly remember why things work that way!
I’ll probably have an image by tomorrow that will run simulations for the ruzzle problem full-time. I wanna beat that record!