Reality once again ruined my plans :) Here’s what I’ve done since June 15th:
My initial timeline had a pair of packages which made me work on all of these packages.
The predictprotein package turned out to be a complex pipeline, which uses a long list of other packages. There was also proftmb - both programs are by RostLab.
Predictprotein raised millions of errors, and some of these errors appeared because of the packages it depends on. I thought that it would be good to write tests for the other RostLab packages, since they are all connected (some of them are dependencies of others). That’s why I decided to write tests for all packages in this directory, and only after that to move forward.
I decided to skip some of them - for example, pp-popularity-contest, since it doesn’t do anything biomedically significant, except sending usage reports to RostLab.
I also skipped pssh2 (because I couldn’t figure out yet how to get the sources) and libai-fann-perl (moved to the Debian Perl Group). For the rest, I tried to do my best to fix as many errors and write as many tests as I could.
When I was working on them, I learned about autopkgtest-pkg-perl, which helped me a lot.
Some of these packages were written in Fortran, and I was very grateful to my former scientific advisor for asking me to implement an old folding algorithm in Scala - because of that I already knew what Fortran code may look like (the algorithm’s parameters came with a readme file containing small portions of Fortran code) and I wasn’t afraid of it :)
The Top-1 Scariest Fortran Program in my personal scariness rating is profnet - this source package produces 8 binary packages. I wrote a single test for all of them.
This test takes the binary package name as a parameter, which is fine since all 8 binary packages have a similar structure.
Apparently, only 5 of the 8 packages work well with this test. The other 3 end up with a segmentation fault. I think they require some additional fixes or parameters, but so far I couldn’t find out what’s wrong or what parameters I should provide to run them. That’s why the test for profnet is incomplete.
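To illustrate the idea of one test that takes the program under test as a parameter and tells a clean run apart from a segmentation fault, here is a minimal sketch in Python. It is not the actual profnet test: the function names `describe_exit` and `run_smoke_test` are made up for this example, and it assumes a POSIX system, where a negative return code means the process was killed by a signal.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Translate a subprocess return code into a short verdict string."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code means the process was killed by that signal,
        # e.g. -11 for a segmentation fault (SIGSEGV).
        return "killed by " + signal.Signals(-returncode).name
    return "failed with exit code %d" % returncode


def run_smoke_test(command):
    """Run one command (a list: binary name plus arguments) and return its verdict.

    The binary name is a parameter, so the same function can be reused
    for several programs that share a similar structure.
    """
    result = subprocess.run(command, capture_output=True)
    return describe_exit(result.returncode)
```

A call such as `run_smoke_test([binary_name, *arguments])` could then be repeated for each binary package, with the arguments each program needs filled in by hand.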
I haven’t fixed profphd yet, since it requires an old version of Perl, and I don’t speak Perl well enough yet to fix it.
Predictprotein turned out to be the worst of all. It requires a ~30GB database, which has to be installed by hand. And still it raises an error, because this is an outdated version of the BLASTP database. The database is read by the blastpgp program (from the ncbi-blast+ package), and the latest version of this program fails on that database.
That made me think a lot about typical problems with bioinformatics software - lack of standardization and database versioning.
blastpgp works well with the latest BLASTP database from the NCBI website.
I tried to run it and it works! But I ran it by hand, with a patched profphd version installed (not committed yet). The problem with the database remains - I’ll probably try to make a smaller version of the NCBI database and use it to build a testsuite for autopkgtest.
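One possible way to get a smaller database is to keep only the first few records of the sequence file. The sketch below assumes the source sequences are available as a plain FASTA file, where every record starts with a `>` header line; the function names are hypothetical, and the resulting file would still have to be formatted into a BLAST database afterwards, which is not shown here.

```python
def take_fasta_records(lines, limit):
    """Yield the lines belonging to the first `limit` FASTA records."""
    seen = 0
    for line in lines:
        if line.startswith(">"):
            # A header line opens a new record.
            seen += 1
            if seen > limit:
                return
        if seen > 0:
            # Skip anything that appears before the first header.
            yield line


def write_fasta_subset(source_path, target_path, limit):
    """Copy the first `limit` records from one FASTA file into another."""
    with open(source_path) as source, open(target_path, "w") as target:
        target.writelines(take_fasta_records(source, limit))
```

Taking the first records is the simplest choice, not necessarily the best one: a test database probably also has to contain sequences similar to the test query, otherwise the search may return nothing useful.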
libgo-perl raised errors during the predictprotein run. That’s why I had to write tests for these two and fix the errors.
The best thing is - I have almost finished those nasty RostLab packages! I hope that next week I can start working on