This was originally posted in French here.
There’s no string function in MySQL (and many other databases!) to help you find the number of occurrences of a string within another string. For example, how many times does “abc” appear in “abcbcbabcbacbcabcababcabacb”?
I was asked this question on IRC a long time ago. Some poor soul was trying to count how many times a particular subsequence (for instance “TAT”) appears in the following genomic sequence:
ATTGGTGGGCTCTACTAAGATATCAACGGGACTTCGGAGCGTGCCGCACTATTT
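Obviously, you can use your favorite programming language and do this kind of search programmatically, but is there a way to do it in SQL?
Luckily, the answer is yes! The solution is simple: remove every occurrence of the target with REPLACE, measure how much shorter the string became, and divide that difference by the length of the target. It looks like this: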
SELECT FLOOR(( LENGTH(source) - LENGTH(REPLACE(source, target, '')) ) / (LENGTH(target))) as occ
To come back to our example, “source” being the genomic sequence and “target” being “TAT”, you’d have:
SELECT FLOOR(( LENGTH('ATTGGTGGGCTCTACTAAGATATCAACGGGACTTCGGAGCGTGCCGCACTATTT') - LENGTH(REPLACE('ATTGGTGGGCTCTACTAAGATATCAACGGGACTTCGGAGCGTGCCGCACTATTT', 'TAT', '')) ) / (LENGTH('TAT'))) as occ
Here’s the answer: the query returns 2, because “TAT” appears twice in the sequence.
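If you want to double-check the arithmetic outside the database, here is a minimal Python sketch of the same length-difference trick. The function name count_occurrences is made up for this illustration, and Python's str.replace stands in for MySQL's REPLACE, so treat it as a cross-check rather than a port of the query.

```python
def count_occurrences(source: str, target: str) -> int:
    """Count non-overlapping occurrences of target in source.

    Hypothetical helper mirroring the SQL expression:
    (LENGTH(source) - LENGTH(REPLACE(source, target, ''))) / LENGTH(target)
    """
    if not target:
        # An empty target would mean dividing by zero.
        raise ValueError("target must be a non-empty string")
    # Number of characters that disappear once every occurrence is removed.
    removed = len(source) - len(source.replace(target, ""))
    # Each occurrence accounts for len(target) removed characters.
    return removed // len(target)


sequence = "ATTGGTGGGCTCTACTAAGATATCAACGGGACTTCGGAGCGTGCCGCACTATTT"
print(count_occurrences(sequence, "TAT"))  # prints 2
```

Like the SQL version, this counts non-overlapping occurrences only: in “TATAT”, removing the first “TAT” leaves “AT”, so the result is 1 even though a second “TAT” overlaps the first.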
Fortunately, in life there are far more solutions than problems! And sometimes, long SQL queries!